Abstract

We give a short introduction to Padé approximation (rational approximation to a function
with close contact at one point) and to Hermite-Padé approximation (simultaneous rational
approximation to several functions with close contact at one point) and show how orthogonality
plays a crucial role. We give some insight into how logarithmic potential theory helps in
describing the asymptotic behavior and the convergence properties of Padé and Hermite-Padé
approximation.

1 Padé approximation
1.1 Taylor polynomials 

The general setup in approximation theory is that a function $f$ is given and that one wants to
approximate it with a simpler function $g$ in such a way that the difference between $f$ and $g$ is
small. The advantage is that the simpler function $g$ can be handled without too many difficulties,
but the disadvantage is that one loses some information since $f$ and $g$ are different.

In the setting of Padé approximation one starts with a function $f : \mathbb{C} \to \mathbb{C}$ for which a Taylor
expansion is known in the neighborhood of a given point $a \in \mathbb{C}$, i.e.,

$$ f(z) = \sum_{k=0}^{\infty} c_k (z-a)^k, \qquad c_k = \frac{f^{(k)}(a)}{k!}. \tag{1.1} $$

The function $f$ cannot be computed exactly using this Taylor expansion since this requires an
infinite number of additions (and multiplications). We can obtain a polynomial approximation by
truncating after $n$ terms. The corresponding approximations are Taylor polynomials given by

$$ f_n(z) = \sum_{k=0}^{n-1} c_k (z-a)^k, \tag{1.2} $$

and these Taylor polynomials are therefore characterized by

$$ f(z) - f_n(z) = O((z-a)^n), \qquad z \to a. \tag{1.3} $$

This condition is a (confluent) interpolation condition which tells us that the difference $f - f_n$ has
a zero of multiplicity $n$ at the point $a$. We know an explicit formula for the Taylor polynomial,
namely

$$ f_n(z) = \sum_{k=0}^{n-1} \frac{f^{(k)}(a)}{k!} (z-a)^k, $$

and the error is given by

$$ f(z) - f_n(z) = \sum_{k=n}^{\infty} \frac{f^{(k)}(a)}{k!} (z-a)^k. $$

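If $f$ is analytic in a domain $\Omega$ that contains $a$ and if $\Gamma$ is a closed contour in $\Omega$ encircling $a$ once in
the positive direction (counterclockwise), then Cauchy's formula gives

$$ \frac{f^{(k)}(a)}{k!} = \frac{1}{2\pi i} \int_\Gamma \frac{f(\xi)}{(\xi-a)^{k+1}}\, d\xi, $$

and hence

$$ f_n(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(\xi)}{\xi-a} \sum_{k=0}^{n-1} \left( \frac{z-a}{\xi-a} \right)^k d\xi. $$

The error then becomes, for $z$ inside $\Gamma$,

$$ f(z) - f_n(z) = \frac{1}{2\pi i} \int_\Gamma \frac{f(\xi)}{\xi-z} \left( \frac{z-a}{\xi-a} \right)^n d\xi. \tag{1.4} $$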
The convergence of $f_n$ to $f$ corresponds to the convergence of the Taylor series, and typically one
has uniform convergence on closed disks $|z-a| \le r$, where $r < \rho(f)$ and

$$ \rho(f) := \sup\{ R : f \text{ is analytic in } |z-a| < R \} $$

is the radius of convergence of the series in (1.1). Indeed, if we choose $\epsilon > 0$ such that $r + \epsilon < \rho(f)$
and if we take for $\Gamma$ the circle $|\xi - a| = r + \epsilon$, then for $|z-a| \le r$ we have from (1.4) by straightforward
estimations

$$ |f(z) - f_n(z)| \le \frac{r+\epsilon}{\epsilon} \left( \frac{r}{r+\epsilon} \right)^n \max_{\xi \in \Gamma} |f(\xi)|, $$

and since $r/(r+\epsilon) < 1$ we see that the right hand side converges to 0. So convergence is only
guaranteed on disks with a radius less than the radius of convergence. The function $f$ may be
analytic in a larger domain (the radius of convergence depends on the singularity of $f$ closest to
$a$), but the Taylor approximation will not converge outside the disk with radius $\rho(f)$.
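
The restriction to the disk of convergence is easy to observe numerically. The following is a
minimal sketch under assumptions made here for illustration only: it takes $f(z) = \log(1+z)$ with
$a = 0$ (so the nearest singularity $z = -1$ gives $\rho(f) = 1$), and the helper name `taylor_log1p` is
hypothetical.

```python
import math

def taylor_log1p(z, n):
    """Taylor polynomial f_n(z) of log(1+z) about a = 0, i.e. the terms k = 0..n-1 of (1.2).

    Here c_0 = 0 and c_k = (-1)**(k+1) / k for k >= 1.
    """
    return sum((-1) ** (k + 1) * z ** k / k for k in range(1, n))

# Inside the disk |z| < 1 the error shrinks with n; outside it grows without bound,
# even though log(1+z) is perfectly analytic at z = 2.
for z in (0.5, 2.0):
    errors = [abs(math.log(1 + z) - taylor_log1p(z, n)) for n in (10, 20, 40)]
    print(z, errors)
```

The point $z = 2$ lies in the domain of analyticity of the function but outside the disk of radius
$\rho(f)$, which is exactly the situation described above.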



1.2 Padé approximants

Polynomials are not such a good class of functions if one wants to approximate functions with
singularities because polynomials are entire functions without singularities. They are only useful up
to the first singularity of $f$ near $a$. Rational functions are the simplest functions with singularities.
The idea is that the poles of the rational functions will move to the singularities of the function $f$,
and hence the domain of convergence could be enlarged, and singularities of $f$ may be discovered
using the poles of the rational approximants.

The $[m,n]$ Padé approximant of $f$ at $a$ is the rational function $Q_m/P_n$, with $Q_m$ a polynomial
of degree $\le m$ and $P_n$ a polynomial of degree $\le n$, for which we have the following interpolation
condition at $a$:

$$ f(z) - \frac{Q_m(z)}{P_n(z)} = O((z-a)^{m+n+1}), \qquad z \to a. \tag{1.5} $$

The computation of the polynomials $P_n$ and $Q_m$ is not so easy from this interpolation condition,
since one first has to compute the Taylor expansion of $Q_m/P_n$ and then equate the first $m+n+1$
Taylor coefficients to the first $m+n+1$ Taylor coefficients of $f$. Usually the Padé approximant is
defined by linearizing the interpolation condition as

$$ P_n(z) f(z) - Q_m(z) = O((z-a)^{m+n+1}), \qquad z \to a. \tag{1.6} $$

For Padé approximation near infinity to a function of the form

$$ f(z) = \sum_{k=0}^{\infty} \frac{c_k}{z^{k+1}}, $$

one takes $m = n-1$ and the interpolation condition is

$$ P_n(z) f(z) - Q_{n-1}(z) = O(z^{-n-1}), \qquad z \to \infty, $$

(see Section 1.3). There is a degree of freedom since we can multiply both sides of (1.6) by a
constant. Usually we normalize this by taking $P_n$ monic (i.e., of the form $z^n + \cdots$) when this is
possible, and this can only be done if $P_n$ is of exact degree $n$. If we take $P_n$ monic, then we can
determine the $n$ unknown coefficients $a_k$ ($k = 1, \ldots, n$) in

$$ P_n(z) = \sum_{k=0}^{n} a_k (z-a)^{n-k}, \qquad a_0 = 1, \tag{1.7} $$

by putting the coefficients of $(z-a)^k$ for $k = m+1, m+2, \ldots, m+n$ in the Taylor expansion of
$P_n f$ equal to zero. The polynomial $Q_m$ then corresponds to the Taylor polynomial of degree $m$ of
$P_n f$.

Here is another approach. Suppose $f$ is analytic in a domain $\Omega$ that contains $a$. Again we
take a contour $\Gamma$ inside $\Omega$ encircling $a$ once in the positive direction. Divide both sides of (1.6) by
$(z-a)^{m+k+2}$ and integrate, to find

$$ \frac{1}{2\pi i} \int_\Gamma \frac{P_n(z) f(z)}{(z-a)^{m+k+2}}\, dz - \frac{1}{2\pi i} \int_\Gamma \frac{Q_m(z)}{(z-a)^{m+k+2}}\, dz = \sum_{j=m+n+1}^{\infty} b_{n,j} \frac{1}{2\pi i} \int_\Gamma (z-a)^{j-m-k-2}\, dz, $$

where the $b_{n,j}$'s are the coefficients in the expansion of $P_n f - Q_m$ around $a$. The integral involving
$Q_m$ is zero for $k \ge 0$ since it is proportional to the $(m+k+1)$th derivative of $Q_m$, which is zero
for $k \ge 0$. The sum on the right-hand side has a contribution only when $j = m+k+1$, but when
$0 \le k \le n-1$ then $j \le m+n$ and such indices do not appear in the sum. Hence the right hand
side also vanishes for $k \le n-1$. Therefore (1.6) implies that



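$$ \frac{1}{2\pi i} \int_\Gamma \frac{P_n(z)}{(z-a)^{m+k+2}} f(z)\, dz = 0, \qquad k = 0, 1, \ldots, n-1. $$

If we use the expansion (1.7) then this gives

$$ \sum_{j=0}^{n} a_j \frac{1}{2\pi i} \int_\Gamma (z-a)^{n-j-m-k-2} f(z)\, dz = 0, \qquad k = 0, 1, 2, \ldots, n-1. $$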



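If we use the expansion (1.1) then

$$ \frac{1}{2\pi i} \int_\Gamma (z-a)^{n-j-m-k-2} f(z)\, dz = c_{m-n+k+j+1} $$

(with the convention $c_k = 0$ for $k < 0$), so we get the system of equations

$$ \begin{pmatrix} c_{m-n+1} & c_{m-n+2} & \cdots & c_{m+1} \\ c_{m-n+2} & c_{m-n+3} & \cdots & c_{m+2} \\ \vdots & \vdots & & \vdots \\ c_{m} & c_{m+1} & \cdots & c_{m+n} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \tag{1.8} $$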



There is one degree of freedom here since we have $n+1$ unknowns and $n$ (homogeneous) equations.
The choice $a_0 = 1$ (if possible) gives the monic polynomial $P_n$, but sometimes another normalization
will be used, as we will see later.
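
The linear system for the denominator coefficients can be solved directly. The sketch below is one
possible way to do this, assuming $a = 0$, the monic normalization $a_0 = 1$, and a nonsingular
system; the function names `solve_linear` and `pade` are hypothetical, and exact rational arithmetic
is used only to keep the illustration free of rounding.

```python
import math
from fractions import Fraction

def solve_linear(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square and nonsingular)."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def pade(c, m, n):
    """[m, n] Pade approximant at a = 0 from the Taylor coefficients c[0..m+n].

    Returns (q, p) with Q_m(z) = sum q[i] z**i and P_n(z) = sum p[i] z**i, P_n monic.
    """
    c = [Fraction(x) for x in c]
    coef = lambda i: c[i] if i >= 0 else Fraction(0)
    # Rows k = 0..n-1 of the homogeneous system; the a_0 = 1 column moves to the right-hand side.
    A = [[coef(m - n + k + j + 1) for j in range(1, n + 1)] for k in range(n)]
    b = [-coef(m - n + k + 1) for k in range(n)]
    a = [Fraction(1)] + solve_linear(A, b)
    p = a[::-1]  # a[j] multiplies z**(n-j), so reversing gives ascending powers
    # Q_m is the Taylor polynomial of degree m of P_n * f.
    q = [sum(p[l] * c[i - l] for l in range(min(i, n) + 1)) for i in range(m + 1)]
    return q, p

# Example (an illustrative choice): f(z) = exp(z), with c_k = 1/k!.
c = [Fraction(1, math.factorial(k)) for k in range(5)]
q, p = pade(c, 2, 2)
print(q, p)
```

For the exponential function this gives $Q_2(z) = z^2 + 6z + 12$ and $P_2(z) = z^2 - 6z + 12$, which after
dividing both by 12 is the familiar $[2,2]$ approximant of $e^z$.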



1.3 Orthogonality

From now on we will only consider Padé approximants near infinity. These can easily be obtained
from Padé approximation near zero and the change of variable $z \mapsto 1/z$. Indeed, if $g$ has a Taylor
expansion

$$ g(z) = \sum_{k=0}^{\infty} c_k z^k $$

near the origin, then $f(z) := g(1/z)/z$ has an expansion near infinity of the form

$$ f(z) = \sum_{k=0}^{\infty} \frac{c_k}{z^{k+1}}. \tag{1.9} $$



Since $f(z) = O(1/z)$, the only sensible choice of the degree in the rational approximation problem
is to take $m = n-1$ so that $Q_m(z)/P_n(z)$ is also $O(1/z)$. This situation occurs when $f$ is of the
form

$$ f(z) = \int \frac{d\mu(x)}{z-x}, $$

i.e., when $f$ is the Stieltjes transform (or Cauchy transform) of a positive measure $\mu$ on the real
line. The Padé approximants near infinity can be obtained from the Padé approximants near zero
in the following way. The $[n-1,n]$ Padé approximant $Q^*_{n-1}/P^*_n$ for $f^*$ near 0 has the interpolation
condition

$$ P^*_n(x) f^*(x) - Q^*_{n-1}(x) = O(x^{2n}), \qquad x \to 0. $$

Change variables by setting $x = 1/z$ and divide both sides by $z$. With $f(z) := f^*(1/z)/z$ this gives

$$ P^*_n(1/z) f(z) - \frac{1}{z} Q^*_{n-1}(1/z) = O(z^{-2n-1}), \qquad z \to \infty. $$

In order to get polynomials, we multiply both sides by $z^n$. Then

$$ P_n(z) f(z) - Q_{n-1}(z) = O(z^{-n-1}), \qquad z \to \infty, \tag{1.10} $$

where $P_n(z) := z^n P^*_n(1/z)$ and $Q_{n-1}(z) := z^{n-1} Q^*_{n-1}(1/z)$ are obtained by reversing the
polynomials $P^*_n$ and $Q^*_{n-1}$. So the interpolation conditions at infinity are given by (1.10). The system of
equations (1.8) for $f^*$ and $m = n-1$ then changes to the system



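$$ \begin{pmatrix} c_0 & c_1 & \cdots & c_n \\ c_1 & c_2 & \cdots & c_{n+1} \\ \vdots & \vdots & & \vdots \\ c_{n-1} & c_n & \cdots & c_{2n-1} \end{pmatrix} \begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \tag{1.11} $$

for the unknown coefficients of

$$ P_n(z) := \sum_{k=0}^{n} a_k z^k. $$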



Typically we will not be given the function $f$ but rather the infinite sequence of coefficients
$c_0, c_1, c_2, \ldots$ in the Laurent expansion of $f$. With this as input, we define a linear functional $\mathcal{L}$ on
the linear space of polynomials by

$$ \mathcal{L}(x^n) = c_n, \qquad n = 0, 1, 2, \ldots. \tag{1.12} $$

For a polynomial $p(x) = \sum_{k=0}^{m} a_k x^k$ we then have by linearity $\mathcal{L}(p) = \sum_{k=0}^{m} a_k c_k$. If we now look
at the system of equations (1.11), then the coefficients of $P_n$ satisfy the equations

$$ \sum_{j=0}^{n} a_j c_{k+j} = 0, \qquad k = 0, 1, \ldots, n-1. $$
But this is equivalent to saying that

$$ \mathcal{L}(x^k P_n(x)) = 0, \qquad k = 0, 1, \ldots, n-1. \tag{1.13} $$

Hence the polynomial $P_n$ is orthogonal to all polynomials of degree less than $n$ with respect to the
linear functional $\mathcal{L}$. A very useful normalization of $P_n$ is to require that in addition to (1.13) we
also have

$$ \mathcal{L}(P_n^2(x)) = 1. $$

This can always be done when the functional is positive. When the functional is not positive,
then one imposes the extra condition $\mathcal{L}(P_n^2(x)) =: h_n \neq 0$, so that $P_n/\sqrt{h_n}$ has norm one. Once
the polynomial $P_n$ is obtained, the remaining elements in the Padé approximation problem can be
found explicitly in terms of $P_n$. Indeed, if we define

$$ Q_{n-1}(z) := \mathcal{L}\left( \frac{P_n(z) - P_n(x)}{z-x} \right), \tag{1.14} $$

where $\mathcal{L}$ acts on the variable $x$, then, since $[P_n(z) - P_n(x)]/(z-x)$ is a polynomial of degree $n-1$
in the variable $z$, $Q_{n-1}$ is a polynomial of degree $n-1$ and (1.14) is equivalent to

$$ P_n(z)\, \mathcal{L}\left( \frac{1}{z-x} \right) - Q_{n-1}(z) = \mathcal{L}\left( \frac{P_n(x)}{z-x} \right). $$



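The linear functional $\mathcal{L}$ was only defined on polynomials, but if we expand $1/(z-x)$ in a Laurent series,
then (at least formally)

$$ \mathcal{L}\left( \frac{1}{z-x} \right) = \mathcal{L}\left( \sum_{k=0}^{\infty} \frac{x^k}{z^{k+1}} \right) = \sum_{k=0}^{\infty} \frac{c_k}{z^{k+1}} = f(z), $$

so what needs to be shown is that

$$ \mathcal{L}\left( \frac{P_n(x)}{z-x} \right) = O\left( \frac{1}{z^{n+1}} \right), \qquad z \to \infty. $$

Using the Laurent series of $1/(z-x)$ we find

$$ \mathcal{L}\left( \frac{P_n(x)}{z-x} \right) = \sum_{k=0}^{\infty} \frac{1}{z^{k+1}}\, \mathcal{L}(x^k P_n(x)), $$

and the orthogonality conditions (1.13) show that the terms with $k \le n-1$ vanish. The first term
is therefore the term with $k = n$, which is $O(1/z^{n+1})$. What we also learn from this proof is that
the error in the Padé approximation problem is given explicitly by

$$ P_n(z) f(z) - Q_{n-1}(z) = \mathcal{L}\left( \frac{P_n(x)}{z-x} \right), \tag{1.15} $$

which is again in terms of the polynomial $P_n$.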
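
The orthogonality conditions and the formula for the numerator can be tried out using only a
sequence of moments. The sketch below is illustrative: it builds monic orthogonal polynomials by
Gram-Schmidt (a swapped-in technique, equivalent to solving the linear system when all
$\mathcal{L}(P_k^2) \neq 0$) and evaluates the numerator by applying the functional termwise. The helper
names are hypothetical and the moments of Lebesgue measure on $[-1,1]$ are an assumed example.

```python
from fractions import Fraction

def functional(moments, p, q=(1,)):
    """L(p*q) for coefficient lists p, q (ascending powers), where L(x**k) = moments[k]."""
    return sum(pi * qj * moments[i + j]
               for i, pi in enumerate(p) for j, qj in enumerate(q))

def monic_orthogonal(moments, n):
    """Monic P_0, ..., P_n orthogonal for L, by Gram-Schmidt on 1, x, x**2, ...

    Needs moments[0..2n-1] and assumes L(P_k**2) != 0 for k < n.
    """
    polys = []
    for d in range(n + 1):
        monomial = [Fraction(0)] * d + [Fraction(1)]
        p = list(monomial)
        for prev in polys:
            coeff = functional(moments, monomial, prev) / functional(moments, prev, prev)
            for i, v in enumerate(prev):
                p[i] -= coeff * v
        polys.append(p)
    return polys

def numerator(moments, p):
    """Coefficients (ascending in z) of Q_{n-1}(z) = L((P_n(z) - P_n(x)) / (z - x))."""
    n = len(p) - 1
    q = [Fraction(0)] * n
    for j in range(1, n + 1):      # (z**j - x**j)/(z - x) = sum_{i<j} z**i x**(j-1-i)
        for i in range(j):
            q[i] += p[j] * moments[j - 1 - i]
    return q

# Assumed example: c_k = 2/(k+1) for even k and 0 for odd k (Lebesgue measure on [-1, 1]).
moments = [Fraction(2, k + 1) if k % 2 == 0 else Fraction(0) for k in range(6)]
P = monic_orthogonal(moments, 3)
print(P[2], numerator(moments, P[2]))
```

With these moments the result is $P_2(x) = x^2 - 1/3$ and $Q_1(z) = 2z$, and $2z/(z^2 - 1/3) = 2/z + 2/(3z^3) + \cdots$
reproduces the first four coefficients $c_0, \ldots, c_3$, in agreement with the order of contact at infinity.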

1.4 Moment problem

The linear functional $\mathcal{L}$ remains a bit mysterious. Obviously it is related to the function $f$, but
we would like to know it somewhat more explicitly. The Riesz representation theorem tells us
that every positive and bounded linear functional on the linear space of continuous functions with
compact support on the real line can be represented by a finite positive measure $\mu$ on the real line
as

$$ \mathcal{L}(f) = \int_{-\infty}^{\infty} f(x)\, d\mu(x). $$

If we want to get convergence results for Padé approximation, then it would be convenient to work
with a bounded and positive linear functional $\mathcal{L}$, which is represented by a finite positive measure
$\mu$. In that case

$$ c_k = \int_{-\infty}^{\infty} x^k\, d\mu(x) \tag{1.16} $$

will be the moments of a positive measure $\mu$ and the function $f$ is the Cauchy transform (Stieltjes
transform) of the measure $\mu$:

$$ f(z) = \int_{-\infty}^{\infty} \frac{1}{z-x}\, d\mu(x). $$

Obviously not every infinite sequence $c_0, c_1, c_2, \ldots$ will lead to a positive and bounded linear
functional. The moment problem is to obtain conditions on this infinite sequence $c_0, c_1, c_2, \ldots$
guaranteeing that they are the moments of a finite positive measure on the real line, as in (1.16). If
the measure is supported on $(-\infty, \infty)$ then this is known as the Hamburger moment problem.
If the measure is supported on the positive axis $[0, \infty)$ then we speak of the Stieltjes moment
problem. If the measure is supported on a finite interval (usually $[0, 1]$), then this is known as the
Hausdorff moment problem. A necessary and sufficient condition that the sequence $c_0, c_1, c_2, \ldots$
consist of moments of a positive measure on $(-\infty, \infty)$ is that all the Hankel matrices

$$ \begin{pmatrix} c_0 & c_1 & \cdots & c_n \\ c_1 & c_2 & \cdots & c_{n+1} \\ \vdots & \vdots & & \vdots \\ c_n & c_{n+1} & \cdots & c_{2n} \end{pmatrix} $$

be positive definite. Observe that the matrix appearing in (1.11) consists of the first $n$ rows of
such a Hankel matrix.

From now on we will add one more restriction, namely that the measure be supported on a
finite interval $[a, b]$. This simplifies our treatment by avoiding non-compactness of the support. So
our function $f$ will be a Markov function

$$ f(z) = \int_a^b \frac{d\mu(x)}{z-x}, $$

and such a function is analytic in $\mathbb{C} \setminus [a, b]$. The singularities of this function therefore are located
on the interval $[a, b]$. The linear functional in this case is given by

$$ \mathcal{L}(g) = \int_a^b g(x)\, d\mu(x), $$

for every continuous function $g$ on $[a, b]$. The denominator polynomials in the Padé approximation
problem are orthogonal polynomials for the measure $\mu$ on the interval $[a, b]$, i.e.,

$$ \int_a^b x^k P_n(x)\, d\mu(x) = 0, \qquad k = 0, 1, \ldots, n-1, \tag{1.17} $$
which we normalize so that they are orthonormal:

$$ \int_a^b P_n^2(x)\, d\mu(x) = 1. \tag{1.18} $$

The numerator polynomials are given by

$$ Q_{n-1}(z) = \int_a^b \frac{P_n(z) - P_n(x)}{z-x}\, d\mu(x), \tag{1.19} $$

and the error is given by

$$ P_n(z) f(z) - Q_{n-1}(z) = \int_a^b \frac{P_n(x)}{z-x}\, d\mu(x). \tag{1.20} $$



1.5 Zeros and poles

The idea of using rational approximation is that the singularities of the Padé approximant would
give an idea of the singularities of the function $f$. This is indeed so when $f$ is a Markov function.
The singularities of the Padé approximant are poles at the zeros of $P_n$. A consequence of the
orthogonality is that these zeros are simple and they all are on the open interval $(a, b)$.

Theorem 1.1. Suppose that the support of $\mu$ is an infinite set in $[a, b]$. Then all the zeros of $P_n$
are simple and located on $(a, b)$.

Proof. Let $x_1, \ldots, x_m$ be the sign changes of $P_n$ on $(a, b)$; then obviously $m \le n$, since each sign
change is a zero. Suppose that $m < n$. Then introduce the polynomial $\pi_m(x) := (x - x_1)(x - x_2) \cdots (x - x_m)$.
The function $P_n(x)\pi_m(x)$ does not change sign on $[a, b]$ and since the support of
$\mu$ contains infinitely many points we conclude that

$$ \int_a^b P_n(x) \pi_m(x)\, d\mu(x) \neq 0. $$

But $P_n$ is orthogonal to all polynomials of degree $< n$, hence this integral is equal to 0. This
contradiction implies that $m = n$. So $P_n$ has $n$ sign changes on $(a, b)$, each a zero of $P_n$, hence each
a simple zero of $P_n$, and $P_n$ has no other zeros. □
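
The statement of the theorem can be checked numerically for a concrete measure. The sketch
below assumes Lebesgue measure on $[-1, 1]$, whose orthogonal polynomials are the Legendre
polynomials, and counts sign changes on a uniform grid; the function names and the grid size are
choices made here for illustration.

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial P_n(x) by its three-term recurrence."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def sign_changes(n, grid_size=1001):
    """Count sign changes of P_n on a uniform grid strictly inside (-1, 1)."""
    values = [legendre(n, -1 + 2 * i / grid_size) for i in range(1, grid_size)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

for n in range(1, 9):
    print(n, sign_changes(n))
```

Each polynomial of degree $n$ shows exactly $n$ sign changes inside the open interval, so all of its
zeros are accounted for there and each is simple.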

1.6 Convergence

When we study the convergence of the Padé approximants, we use (1.20) to find

$$ f(z) - \frac{Q_{n-1}(z)}{P_n(z)} = \frac{1}{P_n(z)} \int_a^b \frac{P_n(x)}{z-x}\, d\mu(x). $$

Observe that

$$ P_n(z) \int_a^b \frac{P_n(x)}{z-x}\, d\mu(x) = \int_a^b \frac{P_n(z) - P_n(x)}{z-x} P_n(x)\, d\mu(x) + \int_a^b \frac{P_n^2(x)}{z-x}\, d\mu(x). $$

The fraction $[P_n(z) - P_n(x)]/(z-x)$ is a polynomial of degree $n-1$ in the variable $x$, so by
orthogonality the first integral on the right vanishes. This gives

$$ P_n(z) \int_a^b \frac{P_n(x)}{z-x}\, d\mu(x) = \int_a^b \frac{P_n^2(x)}{z-x}\, d\mu(x), $$

and the error in Padé approximation becomes

$$ f(z) - \frac{Q_{n-1}(z)}{P_n(z)} = \frac{1}{P_n^2(z)} \int_a^b \frac{P_n^2(x)}{z-x}\, d\mu(x). \tag{1.21} $$

This error contains two parts: on the one hand it contains the polynomial $P_n$ for which we will
describe the asymptotic behavior in the next subsection, and on the other hand it contains the
integral

$$ \int_a^b \frac{P_n^2(x)}{z-x}\, d\mu(x), $$

which is in fact a Markov function for the probability measure $P_n^2(x)\, d\mu(x)$ when $P_n$ is the
orthonormal polynomial. We can estimate this integral as follows. Suppose that $z$ belongs to a compact set
$K \subset \mathbb{C} \setminus [a, b]$. Then the distance $d_K$ between $K$ and $[a, b]$,

$$ d_K = \inf\{ |z - x| : z \in K,\ x \in [a, b] \}, $$

is strictly positive. Therefore we have

$$ \left| \int_a^b \frac{P_n^2(x)}{z-x}\, d\mu(x) \right| \le \frac{1}{d_K} \int_a^b P_n^2(x)\, d\mu(x) = \frac{1}{d_K}, $$

and this bound is independent of $n$. So the convergence of the Padé approximants is completely
determined by the asymptotic behavior of $P_n$.

1.7 Asymptotic properties

In this subsection we describe the asymptotic behavior of $|P_n(z)|^{1/n}$ when $z \in K$, where $K$ is a
compact subset of $\mathbb{C} \setminus [a, b]$. If we denote the leading coefficient of $P_n$ by $\gamma_n > 0$ and the zeros of
$P_n$ by $x_{1,n} < x_{2,n} < \cdots < x_{n,n}$, then

$$ P_n(z) = \gamma_n \prod_{j=1}^{n} (z - x_{j,n}). $$

The asymptotic behavior thus requires knowing the behavior of $\gamma_n$ and the asymptotic distribution
of the zeros.

Let us first consider the asymptotic distribution of the zeros. Consider the discrete measure

$$ \nu_n = \frac{1}{n} \sum_{j=1}^{n} \delta_{x_{j,n}}, $$

where $\delta_c$ is the Dirac measure with mass 1 at the point $c$. The measure $\nu_n$ describes the distribution
of the zeros of $P_n$. The asymptotic distribution corresponds to an investigation of the limit of this
sequence of measures. All the zeros of $P_n$ are on the interval $[a, b]$, so all the measures $\nu_n$ are
probability measures on $[a, b]$. Helly's selection principle tells us that there will be a subsequence
that converges weakly to a probability measure $\nu$ on $[a, b]$. This means that there is a subsequence
$(n_k)$ such that

$$ \lim_{k \to \infty} \int_a^b g(x)\, d\nu_{n_k}(x) = \int_a^b g(x)\, d\nu(x), $$

for every continuous function $g$ on $[a, b]$. For the monic polynomial $\hat{P}_n := P_n/\gamma_n$ we have

$$ \frac{1}{n} \log |\hat{P}_n(z)| = \frac{1}{n} \sum_{j=1}^{n} \log |z - x_{j,n}| = \int_a^b \log |z - x|\, d\nu_n(x), $$

hence when $z \in K \subset \mathbb{C} \setminus [a, b]$, then the weak convergence implies that

$$ \lim_{k \to \infty} |\hat{P}_{n_k}(z)|^{1/n_k} = \exp\left( \int_a^b \log |z - x|\, d\nu(x) \right). $$

Next, the leading coefficient $\gamma_n$ solves a minimization problem:

Theorem 1.2. We have

$$ \frac{1}{\gamma_n^2} = \min_{q_n(x) = x^n + \cdots} \int_a^b |q_n(x)|^2\, d\mu(x), \tag{1.22} $$

and the minimum is attained at the monic orthogonal polynomial $\hat{P}_n = P_n/\gamma_n$.

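Proof. We can write an arbitrary monic polynomial of degree $n$ as $q_n = \hat{P}_n + \pi_{n-1}$, where $\hat{P}_n = P_n/\gamma_n$
is the monic orthogonal polynomial and $\pi_{n-1}$ is a polynomial of degree $\le n-1$. We then have

$$ \int_a^b |q_n(x)|^2\, d\mu(x) = \int_a^b |\hat{P}_n(x)|^2\, d\mu(x) + \int_a^b |\pi_{n-1}(x)|^2\, d\mu(x) + 2 \int_a^b \hat{P}_n(x) \pi_{n-1}(x)\, d\mu(x). $$

The last integral vanishes because of orthogonality, so that

$$ \min_{q_n(x) = x^n + \cdots} \int_a^b |q_n(x)|^2\, d\mu(x) = \int_a^b |\hat{P}_n(x)|^2\, d\mu(x) + \min_{\pi_{n-1}} \int_a^b |\pi_{n-1}(x)|^2\, d\mu(x). $$

The minimum on the right hand side is obtained by taking $\pi_{n-1} = 0$, so the minimum in (1.22) is
obtained for the monic orthogonal polynomial. Since $P_n = \gamma_n \hat{P}_n$ is orthonormal, the minimal value
is $\int_a^b |\hat{P}_n(x)|^2\, d\mu(x) = 1/\gamma_n^2$. □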

Without going too much into details, this extremal problem for $\gamma_n$ will in fact tell us that the
asymptotic behavior of $\gamma_n$ and the asymptotic distribution of the zeros (the measure $\nu$) are
described by an equilibrium problem for (logarithmic) potentials. There is a unique probability
measure $\mu_e$ on $[a, b]$ that minimizes the logarithmic energy

$$ \int_a^b \int_a^b \log \frac{1}{|x - y|}\, d\sigma(x)\, d\sigma(y) $$

over all probability measures $\sigma$ supported on $[a, b]$. This measure is given by

$$ d\mu_e(x) = \frac{1}{\pi} \frac{dx}{\sqrt{(x-a)(b-x)}}, \qquad x \in [a, b], $$

and has the property that its logarithmic potential satisfies

$$ U(x; \mu_e) = \int_a^b \log \frac{1}{|x - y|}\, d\mu_e(y) = -\log \frac{b-a}{4}, \qquad x \in [a, b]. $$

This equilibrium measure corresponds to the measure $\nu$ describing the asymptotic zero
distribution when the orthogonality measure $\mu$ is sufficiently regular on $[a, b]$. A sufficient condition is
that $\mu' > 0$ almost everywhere on $[a, b]$ (Erdős-Turán condition). Furthermore, we also have

$$ \lim_{n \to \infty} \gamma_n^{1/n} = \frac{4}{b-a}. $$

Combining both results shows that when $\mu' > 0$ almost everywhere on $[a, b]$ we have

$$ \lim_{n \to \infty} |P_n(z)|^{1/n} = \frac{4}{b-a} \exp\left( -\int_a^b \log \frac{1}{|z - x|}\, d\mu_e(x) \right). $$

When $z$ is on the interval $[a, b]$ then the right hand side is equal to 1, but when $z$ moves away from
$[a, b]$, then the right hand side becomes $> 1$. On the equipotential curves

$$ C_r = \left\{ z \in \mathbb{C} \setminus [a, b] : \frac{4}{b-a} \exp\left( -\int_a^b \log \frac{1}{|z - x|}\, d\mu_e(x) \right) = r \right\} $$

with $r > 1$ we then conclude from (1.21) that

$$ \lim_{n \to \infty} \left| f(z) - \frac{Q_{n-1}(z)}{P_n(z)} \right|^{1/n} = \frac{1}{r^2}, $$

showing that we have exponential convergence.
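
The geometric rate can be observed in a small experiment. The sketch below is only an illustration
under assumptions chosen here: $\mu$ is Lebesgue measure on $[-1, 1]$, so that $f(z) = \log((z+1)/(z-1))$
and the denominators are Legendre polynomials, and for this interval the level of the equipotential
curve through a real point $z > 1$ is $r = z + \sqrt{z^2 - 1}$. The function name `pade_error` is hypothetical.

```python
import math

def pade_error(n, z):
    """f(z) - Q_{n-1}(z)/P_n(z) for f(z) = log((z+1)/(z-1)) and n >= 1.

    P_n is the Legendre polynomial and W_n(z) = int (P_n(z) - P_n(x))/(z - x) dx its numerator.
    Both satisfy (k+1) y_{k+1} = (2k+1) z y_k - k y_{k-1}, with P_0 = 1, P_1 = z, W_0 = 0, W_1 = 2.
    """
    p_prev, p = 1.0, z
    w_prev, w = 0.0, 2.0
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * z * p - k * p_prev) / (k + 1)
        w_prev, w = w, ((2 * k + 1) * z * w - k * w_prev) / (k + 1)
    return math.log((z + 1) / (z - 1)) - w / p

z = 2.0
r = z + math.sqrt(z * z - 1)   # assumed closed form of the level r for [a, b] = [-1, 1]
errors = [pade_error(n, z) for n in range(1, 9)]
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]
print(ratios[-1], 1 / r ** 2)
```

The ratio of successive errors settles near $1/r^2 \approx 0.072$, so each additional degree gains a little
more than one decimal digit at $z = 2$.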

2 Hermite-Padé approximation

Hermite-Padé approximation is simultaneous rational approximation to a vector of $r$ functions
$f_1, f_2, \ldots, f_r$, which are all given as Taylor series around a point $a \in \mathbb{C}$ and for which we require
interpolation conditions at $a$. We will restrict our attention to Hermite-Padé approximation around
infinity and impose interpolation conditions at infinity.

2.1 Definition

Suppose we are given $r$ functions with Laurent expansions

$$ f_j(z) = \sum_{k=0}^{\infty} \frac{c_{k,j}}{z^{k+1}}, \qquad j = 1, \ldots, r. $$

There are basically two different types of Hermite-Padé approximation. First we will need
multi-indices $\vec{n} = (n_1, n_2, \ldots, n_r) \in \mathbb{N}^r$ and their size $|\vec{n}| = n_1 + n_2 + \cdots + n_r$.

Definition 2.1 (Type I). Type I Hermite-Padé approximation to the vector $(f_1, \ldots, f_r)$ near
infinity consists of finding a vector $(A_{\vec{n},1}, \ldots, A_{\vec{n},r})$ of polynomials and a polynomial $B_{\vec{n}}$, with $A_{\vec{n},j}$
of degree $\le n_j - 1$, such that

$$ \sum_{j=1}^{r} A_{\vec{n},j}(z) f_j(z) - B_{\vec{n}}(z) = O\left( \frac{1}{z^{|\vec{n}|}} \right), \qquad z \to \infty. \tag{2.1} $$
In type I Hermite-Padé approximation one wants to approximate a linear combination (with
polynomial coefficients) of the $r$ functions by a polynomial. This is often done for the vector of
functions $f, f^2, \ldots, f^r$, where $f$ is a given function. The solution of the equation

$$ \sum_{j=1}^{r} A_{\vec{n},j}(z) \hat{f}^j(z) - B_{\vec{n}}(z) = 0 $$

is an algebraic function which gives an algebraic approximant $\hat{f}$ for the function $f$.

Definition 2.2 (Type II). Type II Hermite-Padé approximation to the vector $(f_1, \ldots, f_r)$ near
infinity consists of finding a polynomial $P_{\vec{n}}$ of degree $\le |\vec{n}|$ and polynomials $Q_{\vec{n},j}$ ($j = 1, 2, \ldots, r$)
such that

$$ \begin{aligned} P_{\vec{n}}(z) f_1(z) - Q_{\vec{n},1}(z) &= O\left( \frac{1}{z^{n_1+1}} \right), & z &\to \infty, \\ &\ \,\vdots \\ P_{\vec{n}}(z) f_r(z) - Q_{\vec{n},r}(z) &= O\left( \frac{1}{z^{n_r+1}} \right), & z &\to \infty. \end{aligned} \tag{2.2} $$

Type II Hermite-Padé approximation therefore corresponds to an approximation of each function
$f_j$ separately by rational functions with a common denominator $P_{\vec{n}}$. Combinations of type I
and type II Hermite-Padé approximation are also possible.

2.2 Orthogonality 

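When we consider $r$ Markov functions

$$ f_j(z) = \int_{a_j}^{b_j} \frac{d\mu_j(x)}{z-x}, \qquad j = 1, 2, \ldots, r, $$

then Hermite-Padé approximation corresponds again to certain orthogonality conditions.

First consider type I approximation. Multiply (2.1) by $z^k$ and integrate over a contour $\Gamma$
encircling all the intervals $[a_j, b_j]$ in the positive direction. Then

$$ \frac{1}{2\pi i} \int_\Gamma z^k \sum_{j=1}^{r} A_{\vec{n},j}(z) f_j(z)\, dz - \frac{1}{2\pi i} \int_\Gamma z^k B_{\vec{n}}(z)\, dz = \sum_{\ell=|\vec{n}|}^{\infty} b_{\vec{n},\ell} \frac{1}{2\pi i} \int_\Gamma z^{k-\ell}\, dz, $$

where the $b_{\vec{n},\ell}$ are the coefficients of the Laurent expansion of the left hand side in (2.1). Cauchy's
theorem implies

$$ \frac{1}{2\pi i} \int_\Gamma z^k B_{\vec{n}}(z)\, dz = 0. $$

Furthermore, there is only a contribution on the right hand side when $\ell = k+1$, so when $k \le |\vec{n}| - 2$,
then none of the terms in the infinite sum has a contribution. Therefore we see that

$$ \frac{1}{2\pi i} \int_\Gamma z^k \sum_{j=1}^{r} A_{\vec{n},j}(z) f_j(z)\, dz = 0, \qquad 0 \le k \le |\vec{n}| - 2. $$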
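Now each $f_j$ is a Markov function, so by changing the order of integration we get

$$ \sum_{j=1}^{r} \int_{a_j}^{b_j} \frac{1}{2\pi i} \int_\Gamma \frac{z^k A_{\vec{n},j}(z)}{z-x}\, dz\, d\mu_j(x) = 0. $$

Since $\Gamma$ is a contour encircling $[a_j, b_j]$ we have that

$$ \frac{1}{2\pi i} \int_\Gamma \frac{z^k A_{\vec{n},j}(z)}{z-x}\, dz = x^k A_{\vec{n},j}(x), $$

so that we get the following orthogonality conditions

$$ \sum_{j=1}^{r} \int_{a_j}^{b_j} x^k A_{\vec{n},j}(x)\, d\mu_j(x) = 0, \qquad k = 0, 1, \ldots, |\vec{n}| - 2. \tag{2.3} $$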



These are $|\vec{n}| - 1$ linear and homogeneous equations for the $|\vec{n}|$ coefficients of the $r$ polynomials
$A_{\vec{n},j}$ ($j = 1, 2, \ldots, r$), so that we can determine these polynomials up to a multiplicative factor,
provided that the rank of the matrix in this system is $|\vec{n}| - 1$. If the solution is unique (up to a
multiplicative factor), then we say that $\vec{n}$ is a normal index for type I. One can show that this
is equivalent to the condition that the degree of each $A_{\vec{n},j}$ is exactly $n_j - 1$. Once the polynomial
vector $(A_{\vec{n},1}, \ldots, A_{\vec{n},r})$ is determined, we can also find the remaining polynomial $B_{\vec{n}}$, which is given
by

$$ B_{\vec{n}}(z) = \sum_{j=1}^{r} \int_{a_j}^{b_j} \frac{A_{\vec{n},j}(z) - A_{\vec{n},j}(x)}{z-x}\, d\mu_j(x). \tag{2.4} $$

Indeed, with this definition of $B_{\vec{n}}$ we have

$$ \sum_{j=1}^{r} A_{\vec{n},j}(z) f_j(z) - B_{\vec{n}}(z) = \sum_{j=1}^{r} \int_{a_j}^{b_j} \frac{A_{\vec{n},j}(x)}{z-x}\, d\mu_j(x). \tag{2.5} $$

If we use the expansion

$$ \frac{1}{z-x} = \sum_{k=0}^{\infty} \frac{x^k}{z^{k+1}}, $$

then the right hand side is

$$ \sum_{k=0}^{\infty} \frac{1}{z^{k+1}} \sum_{j=1}^{r} \int_{a_j}^{b_j} x^k A_{\vec{n},j}(x)\, d\mu_j(x), $$

and the orthogonality conditions (2.3) show that the sum over $k$ starts with $k = |\vec{n}| - 1$, hence
the right hand side is $O(z^{-|\vec{n}|})$, which is the order given in the definition of type I Hermite-Padé
approximation.

Next we consider type II approximation. Multiply (2.2) by $z^k$ and integrate over a contour $\Gamma$
encircling all the intervals $[a_j, b_j]$. Then

$$ \frac{1}{2\pi i} \int_\Gamma z^k P_{\vec{n}}(z) f_j(z)\, dz - \frac{1}{2\pi i} \int_\Gamma z^k Q_{\vec{n},j}(z)\, dz = \sum_{\ell=n_j+1}^{\infty} b_{\vec{n},j,\ell} \frac{1}{2\pi i} \int_\Gamma z^{k-\ell}\, dz, $$

where the $b_{\vec{n},j,\ell}$ are the coefficients in the Laurent expansion of the left hand side of (2.2). Cauchy's
theorem gives

$$ \frac{1}{2\pi i} \int_\Gamma z^k Q_{\vec{n},j}(z)\, dz = 0, $$

and on the right hand side we only have a contribution when $\ell = k+1$. So for $k \le n_j - 1$ none of
the terms in the infinite sum contribute. Hence

$$ \frac{1}{2\pi i} \int_\Gamma z^k P_{\vec{n}}(z) f_j(z)\, dz = 0, \qquad 0 \le k \le n_j - 1. $$



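Interchanging the order of integration on the left hand side gives the orthogonality conditions

$$ \begin{aligned} \int_{a_1}^{b_1} x^k P_{\vec{n}}(x)\, d\mu_1(x) &= 0, & k &= 0, 1, \ldots, n_1 - 1, \\ &\ \,\vdots \\ \int_{a_r}^{b_r} x^k P_{\vec{n}}(x)\, d\mu_r(x) &= 0, & k &= 0, 1, \ldots, n_r - 1. \end{aligned} \tag{2.6} $$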



This gives $|\vec{n}|$ linear and homogeneous equations for the $|\vec{n}| + 1$ coefficients of $P_{\vec{n}}$, hence we can
obtain the polynomial $P_{\vec{n}}$ up to a multiplicative factor, provided the matrix of coefficients has rank
$|\vec{n}|$. In that case we call the index $\vec{n}$ normal for type II. One can show that this is equivalent to
the condition that the degree of $P_{\vec{n}}$ be exactly $|\vec{n}|$. Once the polynomial $P_{\vec{n}}$ is determined, we can
obtain the polynomials $Q_{\vec{n},j}$ by

$$ Q_{\vec{n},j}(z) = \int_{a_j}^{b_j} \frac{P_{\vec{n}}(z) - P_{\vec{n}}(x)}{z-x}\, d\mu_j(x). \tag{2.7} $$

Indeed, with this expression for $Q_{\vec{n},j}$ we have

$$ P_{\vec{n}}(z) f_j(z) - Q_{\vec{n},j}(z) = \int_{a_j}^{b_j} \frac{P_{\vec{n}}(x)}{z-x}\, d\mu_j(x), \tag{2.8} $$

and if we expand $1/(z-x)$, then the right hand side is of the form

$$ \sum_{k=0}^{\infty} \frac{1}{z^{k+1}} \int_{a_j}^{b_j} x^k P_{\vec{n}}(x)\, d\mu_j(x), $$

and the orthogonality conditions (2.6) show that the infinite sum starts at $k = n_j$, which gives an
expression of $O(z^{-n_j-1})$, which is exactly what is required for type II Hermite-Padé approximation.

2.3 Angelesco systems

Angelesco introduced an interesting system about which more can be said.

Definition 2.3. An Angelesco system $(f_1, f_2, \ldots, f_r)$ consists of $r$ Markov functions for which the
intervals $(a_j, b_j)$ are pairwise disjoint.

All multi-indices are normal for type II in an Angelesco system. We will prove this by showing
that the multiple orthogonal polynomial $P_{\vec{n}}$ has degree exactly equal to $|\vec{n}|$. In fact more is true,
namely:
Theorem 2.1. If $(f_1, \ldots, f_r)$ is an Angelesco system with measures $\mu_j$ that have infinitely many
points in their support, then $P_{\vec{n}}$ has $n_j$ simple zeros on $(a_j, b_j)$ for $j = 1, \ldots, r$.

Proof. Let $x_1, \ldots, x_m$ be the sign changes of $P_{\vec{n}}$ on $(a_j, b_j)$. Suppose that $m < n_j$ and let
$\pi_m(x) := (x - x_1) \cdots (x - x_m)$. Then $P_{\vec{n}} \pi_m$ does not change sign on $[a_j, b_j]$. Since the support of $\mu_j$ has
infinitely many points, we have

$$ \int_{a_j}^{b_j} P_{\vec{n}}(x) \pi_m(x)\, d\mu_j(x) \neq 0. $$

However, the orthogonality (2.6) implies that $P_{\vec{n}}$ is orthogonal to all polynomials of degree $\le n_j - 1$
with respect to the measure $\mu_j$ on $[a_j, b_j]$, so that the integral is zero. This contradiction implies
that $m \ge n_j$, and hence $P_{\vec{n}}$ has at least $n_j$ zeros on $(a_j, b_j)$. This holds for every $j$, and since the
intervals $(a_j, b_j)$ are disjoint this gives at least $|\vec{n}|$ zeros on the real line. But the degree of $P_{\vec{n}}$ is
$\le |\vec{n}|$, hence $P_{\vec{n}}$ has exactly $n_j$ simple zeros on $(a_j, b_j)$. □

The polynomial $P_{\vec{n}}$ can therefore be factored as

$$ P_{\vec{n}}(x) = q_{n_1}(x) q_{n_2}(x) \cdots q_{n_r}(x), $$

where each $q_{n_j}$ is a polynomial of degree $n_j$ with its zeros on $(a_j, b_j)$. The orthogonality (2.6) then
gives

$$ \int_{a_j}^{b_j} x^k q_{n_j}(x) \prod_{i \neq j} q_{n_i}(x)\, d\mu_j(x) = 0, \qquad k = 0, 1, \ldots, n_j - 1. \tag{2.9} $$

The product $\prod_{i \neq j} q_{n_i}(x)$ does not change sign on $(a_j, b_j)$, hence (2.9) shows that $q_{n_j}$ is an
ordinary orthogonal polynomial of degree $n_j$ on the interval $[a_j, b_j]$ with respect to the measure
$\prod_{i \neq j} |q_{n_i}(x)|\, d\mu_j(x)$. The measure depends on the multi-index $\vec{n}$.
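
A small Angelesco system can be worked out exactly. The sketch below is illustrative only: it
assumes $r = 2$ with Lebesgue measure on $[-1, 0]$ and on $[0, 1]$, builds the type II orthogonality
conditions as a linear system for the monic polynomial of degree $|\vec{n}|$, and solves it in rational
arithmetic. The function names are hypothetical.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square and nonsingular)."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def type2_polynomial(moment_lists, index):
    """Monic P of degree |n| with int x**k P dmu_j = 0 for k < n_j and every j.

    moment_lists[j][k] is the k-th moment of mu_j; returns ascending coefficients.
    Assumes the multi-index is normal, so the system is nonsingular.
    """
    size = sum(index)
    rows, rhs = [], []
    for moments, nj in zip(moment_lists, index):
        for k in range(nj):
            rows.append([moments[k + i] for i in range(size)])
            rhs.append(-moments[k + size])   # contribution of the leading term x**size
    return solve_linear(rows, rhs) + [Fraction(1)]

def evaluate(p, x):
    """Evaluate the polynomial with ascending coefficients p at x."""
    return sum(coef * x ** i for i, coef in enumerate(p))

# Assumed example measures: dx on [-1, 0] and dx on [0, 1].
m1 = [Fraction((-1) ** k, k + 1) for k in range(6)]
m2 = [Fraction(1, k + 1) for k in range(6)]
P21 = type2_polynomial([m1, m2], (2, 1))
print(P21, [evaluate(P21, Fraction(x)) for x in (-1, Fraction(-1, 2), 0, 1)])
```

For the multi-index $(2, 1)$ this gives $x^3 + \tfrac{2}{5}x^2 - \tfrac{1}{2}x - \tfrac{2}{15}$, whose values at $-1, -\tfrac12, 0, 1$
alternate in sign: two zeros lie in $(-1, 0)$ and one in $(0, 1)$, as the theorem on Angelesco systems predicts.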

2.4 Algebraic Chebyshev systems

A Chebyshev system $\{\varphi_1, \ldots, \varphi_n\}$ on $[a, b]$ is a linearly independent system of $n$ functions such that
every nontrivial linear combination $\sum_{k=1}^{n} a_k \varphi_k$ has at most $n-1$ zeros on $[a, b]$. This is equivalent
to the condition that

$$ \det \begin{pmatrix} \varphi_1(x_1) & \varphi_1(x_2) & \cdots & \varphi_1(x_n) \\ \varphi_2(x_1) & \varphi_2(x_2) & \cdots & \varphi_2(x_n) \\ \vdots & \vdots & & \vdots \\ \varphi_n(x_1) & \varphi_n(x_2) & \cdots & \varphi_n(x_n) \end{pmatrix} \neq 0 $$

for every choice of $n$ distinct points $x_1, \ldots, x_n \in [a, b]$. Indeed, when $x_1, \ldots, x_n$ are such that the
determinant is zero, then there is a nontrivial linear combination of the rows that gives a zero row, but this
means that for this linear combination $\sum_{k=1}^{n} a_k \varphi_k$ has zeros at $x_1, \ldots, x_n$, giving $n$ zeros, which is
not allowed.

Definition 2.4. A system $(f_1, \ldots, f_r)$ is an algebraic Chebyshev system (AT system) for the index
$\vec{n}$ if each $f_j$ is a Markov function on the same interval $[a, b]$ with a measure $w_j(x)\, d\mu(x)$, where $\mu$
has infinite support and the $w_j$ are such that

$$ \{ w_1, x w_1, \ldots, x^{n_1 - 1} w_1,\ w_2, x w_2, \ldots, x^{n_2 - 1} w_2,\ \ldots,\ w_r, x w_r, \ldots, x^{n_r - 1} w_r \} \tag{2.10} $$

is a Chebyshev system on $[a, b]$.
Theorem 2.2. Suppose $\vec{n}$ is a multi-index such that $(f_1, \ldots, f_r)$ is an AT system on $[a, b]$ for every
index $\vec{m}$ for which $m_j \le n_j$ ($1 \le j \le r$). Then $P_{\vec{n}}$ has $|\vec{n}|$ zeros on $(a, b)$ and hence $\vec{n}$ is a normal
index for type II.

Proof. Let $x_1, \ldots, x_m$ be the sign changes of $P_{\vec{n}}$ on $(a, b)$ and suppose that $m < |\vec{n}|$. We can then
find a multi-index $\vec{m}$ such that $|\vec{m}| = m$ and $m_j \le n_j$ for every $1 \le j \le r$ and $m_k < n_k$ for some
$1 \le k \le r$. Consider the interpolation problem where we want to find a function

$$ L(x) = \sum_{j=1}^{r} q_j(x) w_j(x), $$

where $q_j$ is a polynomial of degree $m_j - 1$ if $j \neq k$ and $q_k$ a polynomial of degree $m_k$, that satisfies

$$ L(x_j) = 0, \quad j = 1, \ldots, m, \qquad L(x_0) = 1 \ \text{for some other point } x_0 \in [a, b]. $$

The function $L$ is a linear combination of

$$ \{ w_1, x w_1, \ldots, x^{m_1 - 1} w_1,\ \ldots,\ w_k, x w_k, \ldots, x^{m_k} w_k,\ \ldots,\ w_r, x w_r, \ldots, x^{m_r - 1} w_r \} $$

and this is, by assumption, a Chebyshev system. This interpolation problem has a unique solution
since it involves a Chebyshev system of basis functions. The function $L$ has, by construction, $m$
zeros and the Chebyshev system has $m + 1$ basis functions, so $L$ can have at most $m$ zeros on $[a, b]$
and each zero is a sign change (see, e.g., [23, pp. 20-21]). Hence $P_{\vec{n}} L$ does not change sign on $[a, b]$.
Since $\mu$ has infinite support, we thus have

$$ \int_a^b L(x) P_{\vec{n}}(x)\, d\mu(x) \neq 0. $$

But the orthogonality (2.6) gives

$$ \int_a^b q_j(x) P_{\vec{n}}(x) w_j(x)\, d\mu(x) = 0, \qquad j = 1, 2, \ldots, r, $$

and this contradiction implies that $P_{\vec{n}}$ has $|\vec{n}|$ simple zeros on $(a, b)$. □

We have a similar result for type I Hermite-Padé approximation:

Theorem 2.3. Suppose $\vec{n}$ is a multi-index such that $(f_1, \ldots, f_r)$ is an AT system on $[a, b]$ for every
index $\vec{m}$ for which $m_j \le n_j$ ($1 \le j \le r$). Then $\sum_{j=1}^{r} A_{\vec{n},j} w_j$ has $|\vec{n}| - 1$ zeros on $(a, b)$ and $\vec{n}$ is a
normal index for type I.

Proof. Let $x_1, \ldots, x_m$ be the sign changes of $\sum_{j=1}^{r} A_{\vec{n},j} w_j$ on $(a, b)$ and suppose that $m < |\vec{n}| - 1$.
Let $\pi_m$ be the monic polynomial with these points as zeros. Then $\pi_m \sum_{j=1}^{r} A_{\vec{n},j} w_j$ does not change
sign on $[a, b]$ and hence

$$ \int_a^b \pi_m(x) \sum_{j=1}^{r} A_{\vec{n},j}(x) w_j(x)\, d\mu(x) \neq 0. $$

But the orthogonality conditions (2.3) indicate that this integral is zero. This contradiction implies
that $m \ge |\vec{n}| - 1$. The sum $\sum_{j=1}^{r} A_{\vec{n},j} w_j$ is a linear combination of the Chebyshev system (2.10),
hence it has at most $|\vec{n}| - 1$ zeros on $[a, b]$. Therefore we see that $m = |\vec{n}| - 1$. To see that the
index $\vec{n}$ is normal for type I, we assume that for some $k$ with $1 \le k \le r$ the degree of $A_{\vec{n},k}$ is less
than $n_k - 1$. Then $\sum_{j=1}^{r} A_{\vec{n},j} w_j$ is a linear combination of the Chebyshev system (2.10) from which
the function $x^{n_k - 1} w_k$ is removed. This is still a Chebyshev system by assumption, and hence this
linear combination has at most $|\vec{n}| - 2$ zeros on $[a, b]$. But this contradicts our previous observation
that it has $|\vec{n}| - 1$ zeros. Therefore every $A_{\vec{n},j}$ has degree exactly $n_j - 1$, so that the index $\vec{n}$ is
normal. □



2.5 Nikishin systems 

A special construction, suggested by Nikishin 12S>j . gives an AT system that can be handled in some 
detail. The construction is by induction. A Nikishin system of order 1 is a Markov function
$f_{1,1}$ for a measure $\mu_1$ on the interval $[a_1,b_1]$. A Nikishin system of order 2 is a vector of Markov
functions $(f_{1,2},f_{2,2})$ on $[a_2,b_2]$ such that

\[ f_{1,2}(z) = \int_{a_2}^{b_2} \frac{d\mu_2(x)}{z-x}, \qquad
   f_{2,2}(z) = \int_{a_2}^{b_2} f_{1,1}(x)\, \frac{d\mu_2(x)}{z-x}, \]

where $f_{1,1}$ is a Nikishin system of order 1 on $[a_1,b_1]$ and $(a_1,b_1) \cap (a_2,b_2) = \emptyset$. In general we have

Definition 2.5. A Nikishin system of order $r$ consists of $r$ Markov functions $(f_{1,r},\ldots,f_{r,r})$ on
$[a_r,b_r]$ such that

\[ f_{1,r}(z) = \int_{a_r}^{b_r} \frac{d\mu_r(x)}{z-x}, \tag{2.11} \]
\[ f_{j,r}(z) = \int_{a_r}^{b_r} f_{j-1,r-1}(x)\, \frac{d\mu_r(x)}{z-x}, \qquad j = 2,\ldots,r, \tag{2.12} \]

where $(f_{1,r-1},\ldots,f_{r-1,r-1})$ is a Nikishin system of order $r-1$ on $[a_{r-1},b_{r-1}]$ and
$(a_r,b_r) \cap (a_{r-1},b_{r-1}) = \emptyset$.

For a Nikishin system of order $r$ one knows that the multi-indices $\vec n$ with $n_1 \ge n_2 \ge \cdots \ge n_r$
are normal (the system is an AT-system for these indices), but it is an open problem whether every
multi-index is normal (for $r > 2$; for $r = 2$ it has been proved that every multi-index is normal).

What can be said about type II Hermite-Pade approximation for $r = 2$? Recall (2.8) for the
function $f_{1,2}$:

\[ P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y) = \int_{a_2}^{b_2} \frac{P_{n_1,n_2}(x)}{y-x}\, d\mu_2(x). \]

Multiply both sides by $y^k$, with $k \le n_1$. Then the right hand side is

\[ \int_{a_2}^{b_2} \frac{y^k P_{n_1,n_2}(x)}{y-x}\, d\mu_2(x)
   = \int_{a_2}^{b_2} \frac{y^k - x^k}{y-x}\, P_{n_1,n_2}(x)\, d\mu_2(x)
   + \int_{a_2}^{b_2} \frac{x^k P_{n_1,n_2}(x)}{y-x}\, d\mu_2(x). \]

Clearly $(y^k - x^k)/(y-x)$ is a polynomial in $x$ of degree $k-1 \le n_1 - 1$, hence the first integral on
the right vanishes because of the orthogonality (2.6). Integrate over the variable $y \in [a_1,b_1]$ with
respect to the measure $\mu_1$. Then we find for $k \le n_1$

\[ \int_{a_1}^{b_1} \bigl[ P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y) \bigr] y^k\, d\mu_1(y)
   = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \frac{x^k P_{n_1,n_2}(x)}{y-x}\, d\mu_2(x)\, d\mu_1(y). \]

Change the order of integration on the right hand side. Then

\[ \int_{a_1}^{b_1} \bigl[ P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y) \bigr] y^k\, d\mu_1(y)
   = - \int_{a_2}^{b_2} x^k P_{n_1,n_2}(x) f_{1,1}(x)\, d\mu_2(x) \]

and this is zero for $k \le n_2 - 1$. Hence if $n_2 \le n_1 + 1$ then the expression $P_{n_1,n_2} f_{1,2} - Q_{n_1,n_2;1}$ is
orthogonal to all polynomials of degree $\le n_2 - 1$ on $[a_1,b_1]$. This implies that $P_{n_1,n_2} f_{1,2} - Q_{n_1,n_2;1}$
has at least $n_2$ zeros on $(a_1,b_1)$ using an argument similar to what we have been using earlier. Let
$R_{n_2}$ be the monic polynomial with $n_2$ of these zeros on $(a_1,b_1)$. Then $[P_{n_1,n_2} f_{1,2} - Q_{n_1,n_2;1}]/R_{n_2}$ is
an analytic function on $\mathbb{C} \setminus [a_2,b_2]$, which has the representation

\[ \frac{P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y)}{R_{n_2}(y)}
   = \frac{1}{R_{n_2}(y)} \int_{a_2}^{b_2} \frac{P_{n_1,n_2}(x)}{y-x}\, d\mu_2(x). \]

Multiply both sides by $y^k$ and integrate over a contour $\Gamma$ encircling the interval $[a_2,b_2]$ in the
positive direction, but with all the zeros of $R_{n_2}$ outside $\Gamma$. Then

\[ \frac{1}{2\pi i} \int_\Gamma y^k\, \frac{P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y)}{R_{n_2}(y)}\, dy
   = \frac{1}{2\pi i} \int_\Gamma \frac{y^k}{R_{n_2}(y)} \int_{a_2}^{b_2} \frac{P_{n_1,n_2}(x)}{y-x}\, d\mu_2(x)\, dy. \]

If we interchange the order of integration on the right hand side and use Cauchy's theorem, then
this gives the integral

\[ \int_{a_2}^{b_2} x^k P_{n_1,n_2}(x)\, \frac{d\mu_2(x)}{R_{n_2}(x)}. \]

By the interpolation condition (2.2), the integrand on the left hand side is of the order $O(y^{k-n_1-n_2-1})$,
so if we use Cauchy's theorem for the exterior of $\Gamma$, then we see that the integral vanishes for
$k \le n_1 + n_2 - 1$. Hence we get

\[ \int_{a_2}^{b_2} x^k P_{n_1,n_2}(x)\, \frac{d\mu_2(x)}{R_{n_2}(x)} = 0, \qquad k = 0, 1, \ldots, n_1 + n_2 - 1. \tag{2.13} \]

This shows that $P_{n_1,n_2}$ is an ordinary orthogonal polynomial on $[a_2,b_2]$ with respect to the measure
$d\mu_2/R_{n_2}$. Observe that $(a_1,b_1) \cap (a_2,b_2) = \emptyset$ implies that $R_{n_2}$ does not change sign on $[a_2,b_2]$.
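Finally we have

\[ \begin{aligned}
   \int_{a_2}^{b_2} \frac{P_{n_1,n_2}^2(x)}{y-x}\, \frac{d\mu_2(x)}{R_{n_2}(x)}
   &= \int_{a_2}^{b_2} P_{n_1,n_2}(x)\, \frac{P_{n_1,n_2}(x) - P_{n_1,n_2}(y)}{y-x}\, \frac{d\mu_2(x)}{R_{n_2}(x)}
    + P_{n_1,n_2}(y) \int_{a_2}^{b_2} \frac{P_{n_1,n_2}(x)}{y-x}\, \frac{d\mu_2(x)}{R_{n_2}(x)} \\
   &= P_{n_1,n_2}(y) \int_{a_2}^{b_2} \frac{P_{n_1,n_2}(x)}{y-x}\, \frac{d\mu_2(x)}{R_{n_2}(x)},
   \end{aligned} \]

since $[P_{n_1,n_2}(y) - P_{n_1,n_2}(x)]/(y-x)$ is a polynomial in $x$ of degree $n_1 + n_2 - 1$ and because of the
orthogonality (2.13). In the same way, $[R_{n_2}(y) - R_{n_2}(x)]/(y-x)$ is a polynomial in $x$ of degree $n_2 - 1$, so that
(2.13) also gives $\int_{a_2}^{b_2} P_{n_1,n_2}(x)/(y-x)\, d\mu_2(x) = R_{n_2}(y) \int_{a_2}^{b_2} P_{n_1,n_2}(x)/(y-x)\, d\mu_2(x)/R_{n_2}(x)$. Hence

\[ P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y)
   = \frac{R_{n_2}(y)}{P_{n_1,n_2}(y)} \int_{a_2}^{b_2} \frac{P_{n_1,n_2}^2(x)}{y-x}\, \frac{d\mu_2(x)}{R_{n_2}(x)}. \tag{2.14} \]

Both sides of the equation have zeros at the zeros of $R_{n_2}$, but there will not be any other zeros on
$[a_1,b_1]$ since the integral on the right hand side has constant sign.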

2.6 Asymptotic properties and convergence 

We restrict ourselves to the case r = 2, but the general case r > 1 can be treated in a similar way 
(with a bit more work). The asymptotic properties of the multiple orthogonal polynomials and the 
convergence of the Hermite-Pade approximants are handled by trying to put everything into terms 
of ordinary orthogonal polynomials. 

2.6.1 Angelesco systems 

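The type II multiple orthogonal polynomial can be factored as $P_{n_1,n_2} = q_{n_1} q_{n_2}$, where $q_{n_1}$ has
$n_1$ zeros on $(a_1,b_1)$ and $q_{n_2}$ has $n_2$ zeros on $(a_2,b_2)$. From (2.8) we get

\[ f_1(z) - \frac{Q_{n_1,n_2;1}(z)}{P_{n_1,n_2}(z)}
   = \frac{1}{q_{n_1}(z) q_{n_2}(z)} \int_{a_1}^{b_1} \frac{q_{n_1}(x)}{z-x}\, q_{n_2}(x)\, d\mu_1(x). \]

We saw that $q_{n_1}$ is an orthogonal polynomial of degree $n_1$ on $[a_1,b_1]$ for the measure $|q_{n_2}(x)|\, d\mu_1(x)$
so we can write

\[ \int_{a_1}^{b_1} \frac{q_{n_1}(x)}{z-x}\, q_{n_2}(x)\, d\mu_1(x)
   = \frac{1}{q_{n_1}(z)} \int_{a_1}^{b_1} \frac{q_{n_1}^2(x)}{z-x}\, q_{n_2}(x)\, d\mu_1(x), \]

as we did earlier in Section 1.6. This gives

\[ f_1(z) - \frac{Q_{n_1,n_2;1}(z)}{P_{n_1,n_2}(z)}
   = \frac{1}{q_{n_1}^2(z) q_{n_2}(z)} \int_{a_1}^{b_1} \frac{q_{n_1}^2(x)}{z-x}\, q_{n_2}(x)\, d\mu_1(x). \]

From here we get the estimate

\[ \left| f_1(z) - \frac{Q_{n_1,n_2;1}(z)}{P_{n_1,n_2}(z)} \right|
   \le \frac{1}{|q_{n_1}(z)|^2\, |q_{n_2}(z)|}\, \frac{1}{d_1} \int_{a_1}^{b_1} q_{n_1}^2(x)\, |q_{n_2}(x)|\, d\mu_1(x), \]

where $d_1$ is the distance between $z$ and $[a_1,b_1]$. If $P_{n_1,n_2}$ is normalized so that it is monic, then we
can take both $q_{n_1}$ and $q_{n_2}$ monic and we get

\[ \left| f_1(z) - \frac{Q_{n_1,n_2;1}(z)}{P_{n_1,n_2}(z)} \right|
   \le \frac{1}{d_1\, \gamma_{n_1;1}^2\, |q_{n_1}(z)|^2\, |q_{n_2}(z)|}, \]

where

\[ \frac{1}{\gamma_{n_1;1}^2} = \int_{a_1}^{b_1} q_{n_1}^2(x)\, |q_{n_2}(x)|\, d\mu_1(x)
   = \min_{\pi_{n_1}(x) = x^{n_1} + \cdots} \int_{a_1}^{b_1} \pi_{n_1}^2(x)\, |q_{n_2}(x)|\, d\mu_1(x). \tag{2.15} \]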






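A similar reasoning holds for the rational approximation to $f_2$ and gives

\[ \left| f_2(z) - \frac{Q_{n_1,n_2;2}(z)}{P_{n_1,n_2}(z)} \right|
   \le \frac{1}{d_2\, \gamma_{n_2;2}^2\, |q_{n_2}(z)|^2\, |q_{n_1}(z)|}, \]

where $d_2$ is the distance of $z$ to $[a_2,b_2]$ and

\[ \frac{1}{\gamma_{n_2;2}^2} = \int_{a_2}^{b_2} q_{n_2}^2(x)\, |q_{n_1}(x)|\, d\mu_2(x)
   = \min_{\pi_{n_2}(x) = x^{n_2} + \cdots} \int_{a_2}^{b_2} \pi_{n_2}^2(x)\, |q_{n_1}(x)|\, d\mu_2(x). \tag{2.16} \]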



The convergence of these rational approximants is therefore given in terms of the asymptotic
behavior of $|q_{n_1}(z)|$, $|q_{n_2}(z)|$ and the constants $\gamma_{n_1;1}$ and $\gamma_{n_2;2}$. These polynomials (and their zeros)
interact with each other: the polynomial $q_{n_1}$ is an orthogonal polynomial for a measure that contains
$q_{n_2}$ as a factor, and $q_{n_2}$ is an orthogonal polynomial for a measure that contains $q_{n_1}$ as a
factor. Let

\[ \nu_{n_1;1} = \frac{1}{n_1} \sum_{j=1}^{n_1} \delta_{x_{j,n_1}}, \qquad
   \nu_{n_2;2} = \frac{1}{n_2} \sum_{j=1}^{n_2} \delta_{y_{j,n_2}}, \]

where $x_{j,n_1}$ are the zeros of $q_{n_1}$ and $y_{j,n_2}$ are the zeros of $q_{n_2}$. Then $(\nu_{n_1;1})$ is a sequence of probability
measures on $[a_1,b_1]$ and $(\nu_{n_2;2})$ is a sequence of probability measures on $[a_2,b_2]$. Helly's selection
principle guarantees that there are weakly converging subsequences with limits $\nu_1$ on $[a_1,b_1]$ and $\nu_2$
on $[a_2,b_2]$. The minimization problems (2.15) and (2.16) lead to an extremal problem in potential
theory for two probability measures. The integral in (2.15) is approximately of the form

\[ \int_{a_1}^{b_1} \exp\bigl[ -2 n_1 U(x;\nu_1) - n_2 U(x;\nu_2) \bigr]\, d\mu_1(x), \]

where $U(x;\nu)$ is the logarithmic potential of $\nu$,

\[ U(x;\nu) = \int \log \frac{1}{|x-y|}\, d\nu(y), \]

and the integral in (2.16) is approximately of the form

\[ \int_{a_2}^{b_2} \exp\bigl[ -2 n_2 U(x;\nu_2) - n_1 U(x;\nu_1) \bigr]\, d\mu_2(x). \]
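The exponential form of these two integrals rests on an elementary identity, written out here as a short worked step. It assumes only that $q_n$ is a monic polynomial of degree $n$ with zeros $x_1,\ldots,x_n$ and normalized zero counting measure $\nu_n$; the symbol $\approx$ stands for the informal replacement of the zero counting measures by their weak limits.

```latex
% q_n(x) = (x - x_1) \cdots (x - x_n), \nu_n = (1/n) \sum_j \delta_{x_j}
\[
  U(x;\nu_n) = \frac{1}{n} \sum_{j=1}^{n} \log \frac{1}{|x - x_j|}
             = -\frac{1}{n} \log |q_n(x)|
  \quad\Longrightarrow\quad
  |q_n(x)| = \exp\bigl[ -n\, U(x;\nu_n) \bigr].
\]
% Applied to the integrand of (2.15):
\[
  q_{n_1}^2(x)\, |q_{n_2}(x)|
  = \exp\bigl[ -2 n_1 U(x;\nu_{n_1;1}) - n_2 U(x;\nu_{n_2;2}) \bigr]
  \approx \exp\bigl[ -2 n_1 U(x;\nu_1) - n_2 U(x;\nu_2) \bigr].
\]
```

The integrand of (2.16) is handled in the same way with the roles of the two polynomials exchanged.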



We want to minimize both integrals over all pairs of probability measures $(\nu_1,\nu_2)$, where the first
measure is supported on $[a_1,b_1]$ and the second measure on $[a_2,b_2]$. If $n_1/(n_1+n_2) \to p$ and
$n_2/(n_1+n_2) \to q$ (so that $p + q = 1$), and if the measures $\mu_1$ and $\mu_2$ are sufficiently regular (e.g.,
$\mu_1' > 0$ almost everywhere on $[a_1,b_1]$ and $\mu_2' > 0$ almost everywhere on $[a_2,b_2]$) then the solution of
the extremal problem satisfies

\[ 2p\, U(x;\nu_1) + q\, U(x;\nu_2) = \ell_1, \qquad x \in \operatorname{supp}(\nu_1) \subset [a_1,b_1], \tag{2.17} \]
\[ p\, U(x;\nu_1) + 2q\, U(x;\nu_2) = \ell_2, \qquad x \in \operatorname{supp}(\nu_2) \subset [a_2,b_2], \tag{2.18} \]

where the $\ell_j$ are constants that act as Lagrange multipliers. For this extremal problem it is possible
that the support of $\nu_1$ is not the full interval $[a_1,b_1]$ and the support of $\nu_2$ can be a subset of $[a_2,b_2]$.
This is a consequence of the interaction: the zeros of $q_{n_1}$ are repelling the zeros of $q_{n_2}$ and vice
versa. The variational conditions (2.17)-(2.18) have to be supplemented with

\[ 2p\, U(x;\nu_1) + q\, U(x;\nu_2) \ge \ell_1, \qquad x \in [a_1,b_1] \setminus \operatorname{supp}(\nu_1), \]
\[ p\, U(x;\nu_1) + 2q\, U(x;\nu_2) \ge \ell_2, \qquad x \in [a_2,b_2] \setminus \operatorname{supp}(\nu_2). \]

The Lagrange multipliers $\ell_1, \ell_2$ appear in the asymptotics of $\gamma_{n_1;1}$ and $\gamma_{n_2;2}$ as

\[ \lim_{n_1+n_2 \to \infty} \gamma_{n_1;1}^{2/(n_1+n_2)} = \exp(\ell_1), \qquad
   \lim_{n_1+n_2 \to \infty} \gamma_{n_2;2}^{2/(n_1+n_2)} = \exp(\ell_2). \]

Our conclusion is that the convergence to the first function $f_1$ is determined by level curves
$C_r = \{ z : \exp[2p\, U(z;\nu_1) + q\, U(z;\nu_2) - \ell_1] = r \}$ with $r < 1$ on which we have

\[ \lim_{n_1+n_2 \to \infty} \left| f_1(z) - \frac{Q_{n_1,n_2;1}(z)}{P_{n_1,n_2}(z)} \right|^{1/(n_1+n_2)} = r, \]

and the convergence to the second function $f_2$ is determined by level curves
$D_r = \{ z : \exp[p\, U(z;\nu_1) + 2q\, U(z;\nu_2) - \ell_2] = r \}$ with $r < 1$ on which we have

\[ \lim_{n_1+n_2 \to \infty} \left| f_2(z) - \frac{Q_{n_1,n_2;2}(z)}{P_{n_1,n_2}(z)} \right|^{1/(n_1+n_2)} = r. \]

Observe that $\operatorname{supp}(\nu_1) \subset C_1$ and $\operatorname{supp}(\nu_2) \subset D_1$, so we don't expect exponential convergence on
these sets. On the remaining part of $[a_1,b_1]$ (and $[a_2,b_2]$) we get values $r > 1$, so we get even worse
behavior there. This is caused by the fact that on these parts of the intervals there will not be
enough zeros of the multiple orthogonal polynomial to simulate the singularities of the functions
$f_1$ and $f_2$.

2.6.2 Nikishin systems 

The analysis for Nikishin systems is similar but leads to a different extremal problem for potentials. 
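We now start from (2.14) which gives

\[ \left| f_{1,2}(y) - \frac{Q_{n_1,n_2;1}(y)}{P_{n_1,n_2}(y)} \right|
   \le \frac{|R_{n_2}(y)|}{|P_{n_1,n_2}(y)|^2}\, \frac{1}{d_2} \int_{a_2}^{b_2} P_{n_1,n_2}^2(x)\, \frac{d\mu_2(x)}{|R_{n_2}(x)|}, \tag{2.19} \]

where $d_2$ is the distance from $y$ to $[a_2,b_2]$. Now we have that $P_{n_1,n_2}$ is a (monic) orthogonal
polynomial on $[a_2,b_2]$ for the measure $d\mu_2/|R_{n_2}|$, so we have

\[ \frac{1}{\gamma_{n_1,n_2}^2} = \int_{a_2}^{b_2} P_{n_1,n_2}^2(x)\, \frac{d\mu_2(x)}{|R_{n_2}(x)|}
   = \min_{\pi(x) = x^{n_1+n_2} + \cdots} \int_{a_2}^{b_2} \pi^2(x)\, \frac{d\mu_2(x)}{|R_{n_2}(x)|}. \tag{2.20} \]

The polynomial $R_{n_2}$ has its zeros on $[a_1,b_1]$ and in fact is a monic orthogonal polynomial on $[a_1,b_1]$
for the measure

\[ \frac{P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y)}{R_{n_2}(y)}\, d\mu_1(y). \]

Indeed, we can verify that

\[ \int_{a_1}^{b_1} y^k R_{n_2}(y)\, \frac{P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y)}{R_{n_2}(y)}\, d\mu_1(y)
   = \int_{a_1}^{b_1} y^k \bigl[ P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y) \bigr]\, d\mu_1(y) = 0, \]

for $k \le n_2 - 1$, since we have seen that the expression $P_{n_1,n_2} f_{1,2} - Q_{n_1,n_2;1}$ is orthogonal to all
polynomials of degree less than $n_2$ on $[a_1,b_1]$ for the measure $\mu_1$. The orthogonality measure for
$R_{n_2}$ can also be written as

\[ \frac{P_{n_1,n_2}(y) f_{1,2}(y) - Q_{n_1,n_2;1}(y)}{R_{n_2}(y)}\, d\mu_1(y)
   = \frac{1}{P_{n_1,n_2}(y)} \int_{a_2}^{b_2} \frac{P_{n_1,n_2}^2(x)}{y-x}\, \frac{d\mu_2(x)}{R_{n_2}(x)}\, d\mu_1(y). \]

In this weight we have

\[ \frac{1}{\gamma_{n_1,n_2}^2 C_1} \le \int_{a_2}^{b_2} \frac{P_{n_1,n_2}^2(x)}{|y-x|}\, \frac{d\mu_2(x)}{|R_{n_2}(x)|}
   \le \frac{1}{\gamma_{n_1,n_2}^2 C_2}, \]

where $C_1$ and $C_2$ are the maximum and minimum, respectively, over the set

\[ \{ |x-y| : x \in [a_2,b_2],\ y \in [a_1,b_1] \}. \]

So, up to the constants $C_1, C_2$, we have the extremal problem

\[ \frac{1}{\gamma_{n_2;2}^2} = \int_{a_1}^{b_1} R_{n_2}^2(y)\, \frac{d\mu_1(y)}{|P_{n_1,n_2}(y)|}
   = \min_{\pi(y) = y^{n_2} + \cdots} \int_{a_1}^{b_1} \pi^2(y)\, \frac{d\mu_1(y)}{|P_{n_1,n_2}(y)|}. \tag{2.21} \]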



Define the zero distributions

\[ \nu_{n_1+n_2} = \frac{1}{n_1+n_2} \sum_{j=1}^{n_1+n_2} \delta_{x_{j,n_1+n_2}}, \qquad
   \nu_{n_2;2} = \frac{1}{n_2} \sum_{j=1}^{n_2} \delta_{y_{j,n_2}}, \]

where $x_{j,n_1+n_2}$ are the zeros of $P_{n_1,n_2}$ and $y_{j,n_2}$ are the zeros of $R_{n_2}$. Then $(\nu_{n_1+n_2})$ is a sequence of
probability measures on $[a_2,b_2]$ and $(\nu_{n_2;2})$ is a sequence of probability measures on $[a_1,b_1]$. Helly's
selection principle shows that there are weakly convergent subsequences with limits $\nu$ and $\nu_2$ which
are supported on $[a_2,b_2]$ and $[a_1,b_1]$ respectively. The extremal problems (2.20) and (2.21) then
lead to an extremal problem in potential theory. The integral in (2.20) is approximately

\[ \int_{a_2}^{b_2} \exp\bigl[ -2(n_1+n_2) U(x;\nu) + n_2 U(x;\nu_2) \bigr]\, d\mu_2(x) \]

and the integral in (2.21) is approximately

\[ \int_{a_1}^{b_1} \exp\bigl[ -2 n_2 U(x;\nu_2) + (n_1+n_2) U(x;\nu) \bigr]\, d\mu_1(x). \]

If $n_2/(n_1+n_2) \to q$ and $\mu_i' > 0$ almost everywhere on $[a_i,b_i]$ ($i = 1,2$), then this gives the variational
conditions

\[ 2 U(x;\nu) - q\, U(x;\nu_2) = \ell_1, \qquad x \in \operatorname{supp}(\nu) \subset [a_2,b_2], \tag{2.22} \]
\[ -U(x;\nu) + 2q\, U(x;\nu_2) = \ell_2, \qquad x \in \operatorname{supp}(\nu_2) \subset [a_1,b_1], \tag{2.23} \]

where $\ell_1$ and $\ell_2$ are Lagrange multipliers for which

\[ \lim_{n_1+n_2 \to \infty} \gamma_{n_1,n_2}^{2/(n_1+n_2)} = \exp(\ell_1), \qquad
   \lim_{n_1+n_2 \to \infty} \gamma_{n_2;2}^{2/(n_1+n_2)} = \exp(\ell_2). \]

Looking back to (2.19) we thus have

\[ \lim_{n_1+n_2 \to \infty} \left| f_{1,2}(y) - \frac{Q_{n_1,n_2;1}(y)}{P_{n_1,n_2}(y)} \right|^{1/(n_1+n_2)} = r < 1 \]

on level curves $C_r := \{ z : \exp[2 U(z;\nu) - q\, U(z;\nu_2) - \ell_1] = r \}$.

The convergence to the second function $f_{2,2}$ can also be handled but is left as an advanced
exercise for the reader.

3 Applications

3.1 Gauss and simultaneous Gauss quadrature 

Gauss quadrature is directly related to orthogonal polynomials, and hence to Pade approximation. 
Here is an approach based on complex analysis. Suppose $\mu$ is a positive measure on $[a,b]$ and we
denote by $f$ the Markov function for $\mu$,

\[ f(z) = \int_a^b \frac{d\mu(x)}{z-x}. \]

Let $Q_{n-1}/P_n$ be the Pade approximant to $f$ near infinity. Then

\[ f(z) - \frac{Q_{n-1}(z)}{P_n(z)} = O(z^{-2n-1}), \qquad z \to \infty. \]

Multiply both sides by a polynomial $\pi_{2n-1}$ of degree at most $2n-1$, and integrate along a contour
$\Gamma$ encircling the interval $[a,b]$ once in the positive direction. Then

\[ \frac{1}{2\pi i} \int_\Gamma \pi_{2n-1}(z) f(z)\, dz
   = \frac{1}{2\pi i} \int_\Gamma \pi_{2n-1}(z)\, \frac{Q_{n-1}(z)}{P_n(z)}\, dz, \]

because the remainder term vanishes after integration, due to Cauchy's theorem for the outside of
$\Gamma$. Interchanging the order of integration on the left hand side and using the residue theorem on
the right hand side shows that for every polynomial $\pi_{2n-1}$ of degree $\le 2n-1$ we have

\[ \int_a^b \pi_{2n-1}(x)\, d\mu(x) = \sum_{j=1}^n \lambda_{j,n}\, \pi_{2n-1}(x_{j,n}), \tag{3.1} \]

where $\lambda_{j,n}$ is the residue of the Pade approximant at the zero $x_{j,n}$ of $P_n$, i.e.

\[ \lambda_{j,n} = \frac{Q_{n-1}(x_{j,n})}{P_n'(x_{j,n})}. \]

If we take $\pi_{2n-1}(x) = P_n^2(x)/(x - x_{j,n})^2$, then (3.1) gives

\[ \int_a^b \frac{P_n^2(x)}{(x - x_{j,n})^2}\, d\mu(x) = \lambda_{j,n}\, [P_n'(x_{j,n})]^2, \]

which shows that $\lambda_{j,n} > 0$ for $j = 1,\ldots,n$. These weights $\lambda_{j,n}$ are known as Christoffel numbers
or Gauss quadrature coefficients, the zeros $x_{j,n}$ of $P_n$ are Gauss quadrature nodes, and (3.1) is the
Gauss quadrature formula. Replacing $\pi_{2n-1}$ by a continuous function $g$ on $[a,b]$ suggests using
the sum

\[ \sum_{j=1}^n \lambda_{j,n}\, g(x_{j,n}) \]

as an approximation to the integral

\[ \int_a^b g(x)\, d\mu(x). \]

If $[a,b]$ is a finite interval, then every continuous function can be approximated uniformly by
polynomials (Weierstrass), hence the quadrature sum indeed converges to the integral when the
number of nodes $n$ tends to infinity. The positivity of the weights $\lambda_{j,n}$ is needed to get this
convergence. The quadrature formula requires $n$ function evaluations (at the zeros of $P_n$) and is
exact for polynomials of degree $\le 2n-1$, hence on a linear space of dimension $2n$. The ratio
$n/2n = 1/2$ is a measure for the efficiency of this formula.
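The residue formula for the Christoffel numbers can be tried out numerically. The following is a minimal sketch, assuming Lebesgue measure on $[-1,1]$ (so that $P_n$ is the monic Legendre polynomial, generated here by its standard three-term recurrence) and using the identity $Q_{n-1}(z) = \int [P_n(z) - P_n(x)]/(z-x)\, d\mu(x)$ for the numerator. All function names are illustrative choices, and the grid-plus-bisection root finder is a simple stand-in for a proper eigenvalue method.

```python
from fractions import Fraction

def moment(k):
    """Moment of Lebesgue measure on [-1, 1]: integral of x^k."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

def monic_legendre(n):
    """Coefficients (lowest degree first) of the monic Legendre polynomial P_n."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [Fraction(0)] + cur                 # x * P_k
        b = Fraction(k * k, 4 * k * k - 1)        # recurrence coefficient
        for i, c in enumerate(prev):
            nxt[i] -= b * c                       # subtract b_k * P_{k-1}
        prev, cur = cur, nxt
    return cur

def pade_numerator(p):
    """Q(z) = integral of (P(z) - P(x)) / (z - x), as coefficients in z."""
    q = [Fraction(0)] * (len(p) - 1)
    for m, c in enumerate(p):
        for i in range(m):                        # (z^m - x^m)/(z - x) = sum_i z^i x^(m-1-i)
            q[i] += c * moment(m - 1 - i)
    return q

def evaluate(coeffs, x):
    """Horner evaluation in floating point."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + float(c)
    return acc

def derivative(coeffs):
    return [k * c for k, c in enumerate(coeffs)][1:]

def zeros(coeffs, a, b, cells=2001, steps=80):
    """Simple zeros in (a, b): sign changes on a grid, refined by bisection."""
    found = []
    grid = [a + (b - a) * i / cells for i in range(cells + 1)]
    for lo, hi in zip(grid, grid[1:]):
        if evaluate(coeffs, lo) * evaluate(coeffs, hi) < 0:
            for _ in range(steps):
                mid = 0.5 * (lo + hi)
                if evaluate(coeffs, lo) * evaluate(coeffs, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            found.append(0.5 * (lo + hi))
    return found

def gauss_from_pade(n):
    """Nodes x_j (zeros of P_n) and weights Q_{n-1}(x_j) / P_n'(x_j)."""
    p = monic_legendre(n)
    q, dp = pade_numerator(p), derivative(p)
    nodes = zeros(p, -1.0, 1.0)
    weights = [evaluate(q, x) / evaluate(dp, x) for x in nodes]
    return nodes, weights

nodes, weights = gauss_from_pade(3)
for k in range(6):                                # degrees 0, ..., 2n - 1
    rule = sum(w * x ** k for x, w in zip(nodes, weights))
    print(k, rule, float(moment(k)))
```

For $n = 3$ the printed pairs should agree to rounding error for every degree up to $2n - 1 = 5$, and all three weights come out positive, in line with the positivity argument above.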

In a number of applications we need to approximate several integrals of the same function, but 
with respect to different measures. The following example comes from Suppose that g is the 
spectral distribution of light in the direction of the observer and $w_1, w_2, w_3$ are weight functions
describing the profiles for red, green and blue light. Then the integrals

\[ \int_0^{2\pi} g(x) w_1(x)\, dx, \qquad \int_0^{2\pi} g(x) w_2(x)\, dx, \qquad \int_0^{2\pi} g(x) w_3(x)\, dx \]

give the amount of light after passing through the filters for red, green and blue. In this case
we need to approximate three integrals of the same function $g$. We would like to use as few
function evaluations as possible, but the integrals should be accurate for polynomials $g$ of degree
as high as possible. If we use Gauss quadrature with $n$ nodes for each integral, then we require $3n$
function evaluations and all integrals will be correct for polynomials of degree $\le 2n-1$ (a space of
dimension $2n$). This gives an efficiency of 3/2. In fact, with $3n$ function evaluations we can double
the dimension of the space in which the formula is exact. Consider the Markov functions

\[ f_j(z) = \int_0^{2\pi} \frac{w_j(x)}{z-x}\, dx, \qquad j = 1, 2, 3, \]

and the type II Hermite-Pade approximation problem

\[ f_j(z) - \frac{Q_{n,n,n;j}(z)}{P_{n,n,n}(z)} = O(z^{-4n-1}), \qquad z \to \infty. \]

Now we can multiply by a polynomial $\pi_{4n-1}$ of degree at most $4n-1$, and integrate along a contour
$\Gamma$ encircling $[0,2\pi]$ in the positive direction, to obtain

\[ \int_0^{2\pi} \pi_{4n-1}(x) w_j(x)\, dx = \sum_{k=1}^{3n} \lambda_{k,n;j}\, \pi_{4n-1}(x_{k,n}), \qquad j = 1, 2, 3, \tag{3.2} \]

where $x_{k,n}$ are the zeros of $P_{n,n,n}$ and $\lambda_{k,n;j}$ are the residues of $Q_{n,n,n;j}/P_{n,n,n}$ at the zero $x_{k,n}$:

\[ \lambda_{k,n;j} = \frac{Q_{n,n,n;j}(x_{k,n})}{P_{n,n,n}'(x_{k,n})}. \]

Therefore the three integrals will be evaluated exactly by the three sums in (3.2) for polynomials
of degree $\le 4n-1$. The convergence is somewhat more difficult to handle, since we do not
have a general result that the quadrature coefficients $\lambda_{k,n;j}$ are positive. The positivity has to be
investigated separately for Angelesco and Nikishin systems. See ^JJ El QZi for finding out more 
about simultaneous Gauss quadrature. 
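The same recipe can be tested on a small case. The sketch below is not the three-filter example from the text: as a simplifying assumption it takes only two measures, Lebesgue measure on $[-1,0]$ and on $[0,1]$ (an Angelesco system), and the multi-index $(n,n)$. The common denominator is found by solving the orthogonality conditions exactly over the rationals, the weights are the residues $Q_{n,n;j}(x_k)/P_{n,n}'(x_k)$, and by the same counting as above both rules should then be exact up to degree $3n-1$ with only $2n$ nodes. All function names are illustrative.

```python
from fractions import Fraction

def moment(j, k):
    """k-th moment of measure j (j = 0: [-1, 0]; j = 1: [0, 1])."""
    return Fraction((-1) ** k if j == 0 else 1, k + 1)

def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination over the rationals."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                m[r] = [v - m[r][col] * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]

def type2_polynomial(n):
    """Monic P of degree 2n, orthogonal to x^k (k < n) for both measures."""
    deg = 2 * n
    rows, rhs = [], []
    for j in (0, 1):
        for k in range(n):
            rows.append([moment(j, i + k) for i in range(deg)])
            rhs.append(-moment(j, deg + k))
    return solve(rows, rhs) + [Fraction(1)]

def numerator(p, j):
    """Q_j(z) = integral of (P(z) - P(x)) / (z - x) against measure j."""
    q = [Fraction(0)] * (len(p) - 1)
    for m, c in enumerate(p):
        for i in range(m):
            q[i] += c * moment(j, m - 1 - i)
    return q

def evaluate(coeffs, x):
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + float(c)
    return acc

def derivative(coeffs):
    return [k * c for k, c in enumerate(coeffs)][1:]

def zeros(coeffs, a, b, cells=2001, steps=80):
    """Simple zeros in (a, b): sign changes on a grid, refined by bisection."""
    found = []
    grid = [a + (b - a) * i / cells for i in range(cells + 1)]
    for lo, hi in zip(grid, grid[1:]):
        if evaluate(coeffs, lo) * evaluate(coeffs, hi) < 0:
            for _ in range(steps):
                mid = 0.5 * (lo + hi)
                if evaluate(coeffs, lo) * evaluate(coeffs, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            found.append(0.5 * (lo + hi))
    return found

def simultaneous_gauss(n):
    """Common nodes and one weight vector per measure (2n nodes in total)."""
    p = type2_polynomial(n)
    dp = derivative(p)
    nodes = zeros(p, -1.0, 1.0)
    weights = [[evaluate(numerator(p, j), x) / evaluate(dp, x) for x in nodes]
               for j in (0, 1)]
    return nodes, weights

n = 2
nodes, weights = simultaneous_gauss(n)
for j in (0, 1):
    for k in range(3 * n):                        # degrees 0, ..., 3n - 1
        rule = sum(w * x ** k for x, w in zip(nodes, weights[j]))
        print(j, k, rule, float(moment(j, k)))
```

The script deliberately makes no claim about the signs of the weights, since, as noted above, positivity has to be examined separately for each kind of system.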



3.2 Irrationality and transcendence 

Hermite-Pade approximants were introduced by Hermite in his proof that e is transcendental. 
Various irrationality proofs of famous mathematical constants use Hermite-Pade approximation, 
even though this may not always be obvious. Proving irrationality can be done by constructing 
good rational approximants:

Lemma 3.1. Let $x \in \mathbb{R}$. Suppose we can find sequences of integers $(p_n)$, $(q_n)$ such that

1. $q_n x - p_n \neq 0$ for all $n \in \mathbb{N}$,

2. $\lim_{n\to\infty} (q_n x - p_n) = 0$.

Then $x$ is irrational.

Proof. Suppose that $x$ is rational. Then $x = p/q$ for some integers $p, q$. We then have

\[ q_n x - p_n = \frac{q_n p - p_n q}{q} \]

and since this is not zero for every $n$, we see that $q_n p - p_n q \neq 0$ for all $n$. But since these are
integers, this implies that $|q_n p - p_n q| \ge 1$ for all $n$. This shows that $|q_n x - p_n| \ge 1/|q|$, which is in
contradiction with condition 2 in the lemma. Hence we must conclude that $x$ is irrational. □
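A classical instance of the lemma, not taken from the text, is the number $e$ with $q_n = n!$ and $p_n = \sum_{k=0}^{n} n!/k!$. Then $q_n e - p_n = \sum_{k>n} n!/k!$ is positive and smaller than the geometric series $\sum_{j\ge1} (n+1)^{-j} = 1/n$, so both conditions hold. The sketch below checks these bounds in exact rational arithmetic; the function names and the number of retained series terms are arbitrary choices.

```python
from fractions import Fraction
from math import e, factorial

def approximants(n):
    """Integers p_n, q_n with p_n / q_n the n-th partial sum of the series for e."""
    q = factorial(n)
    p = sum(factorial(n) // factorial(k) for k in range(n + 1))
    return p, q

def remainder_bounds(n, terms=30):
    """Rationals lo, hi with 0 < lo < q_n * e - p_n < hi = 1/n (for n >= 1)."""
    q = factorial(n)
    # Partial sum of the positive tail sum_{k > n} n!/k! gives a lower bound.
    lo = sum(Fraction(q, factorial(k)) for k in range(n + 1, n + 1 + terms))
    # The full tail is dominated by a geometric series with ratio 1/(n+1).
    hi = Fraction(1, n)
    return lo, hi

for n in (1, 2, 5, 10):
    p, q = approximants(n)
    lo, hi = remainder_bounds(n)
    print(n, p, q, float(lo), float(hi))
```

Condition 1 corresponds to `lo > 0` and condition 2 to `hi` tending to zero as $n$ grows.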

The construction of the sequences $p_n$ and $q_n$ often uses Pade or Hermite-Pade approximation
for well chosen functions. As an example, consider the two Markov functions

\[ f_1(z) = \int_0^1 \frac{dx}{z-x}, \qquad f_2(z) = \int_{-1}^0 \frac{dx}{z-x}, \]

which form an Angelesco system. Some straightforward calculus gives

\[ f_1(i) = -\frac{1}{2} \log 2 - \frac{i\pi}{4}, \qquad f_2(i) = \frac{1}{2} \log 2 - \frac{i\pi}{4}, \]

hence the sum gives $f_1(i) + f_2(i) = -i\pi/2$. The type II Hermite-Pade approximants for $f_1$ and $f_2$
will give approximations to $\pi$. Recall that

\[ P_{n,n}(z) f_1(z) - Q_{n,n;1}(z) = \int_0^1 \frac{P_{n,n}(x)}{z-x}\, dx, \]
\[ P_{n,n}(z) f_2(z) - Q_{n,n;2}(z) = \int_{-1}^0 \frac{P_{n,n}(x)}{z-x}\, dx. \]

Summing both equations gives

\[ P_{n,n}(z) [f_1(z) + f_2(z)] - [Q_{n,n;1}(z) + Q_{n,n;2}(z)] = \int_{-1}^1 \frac{P_{n,n}(x)}{z-x}\, dx. \]

So the fact that we are using a common denominator comes in very handy here. Then we evaluate
these expressions at $z = i$ and hope that $P_{n,n}(i)$ and $Q_{n,n;1}(i) + Q_{n,n;2}(i)$ are (up to the factor $i$)
integers or rational numbers with simple denominators. Conditions 1 and 2 in Lemma 3.1 can be
checked by using asymptotic properties of Hermite-Pade approximation. For this particular case
the type II multiple orthogonal polynomials are given by a Rodrigues formula

\[ P_{n,n}(x) = \frac{d^n}{dx^n} \bigl( x^n (1 - x^2)^n \bigr), \]

and these polynomials are known as Legendre-Angelesco polynomials. They have been studied
in detail by Kalyagin |22| (see also |S2])- The Rodrigues formula in fact simplifies the asymptotic 
analysis, since integration by parts now gives 



\[ \int_{-1}^1 \frac{P_{n,n}(x)}{z-x}\, dx = (-1)^n\, n! \int_{-1}^1 \frac{x^n (1-x^2)^n}{(z-x)^{n+1}}\, dx, \]

which can be handled easily. Some trial and error shows that one gets better results by taking $2n$
instead of $n$, and by differentiating $n$ times:

\[ \frac{d^n}{dz^n} \Bigl( P_{2n,2n}(z) [f_1(z) + f_2(z)] - [Q_{2n,2n;1}(z) + Q_{2n,2n;2}(z)] \Bigr) \Big|_{z=i}
   = (3n)!\, (-i)^{n+1} \int_{-1}^1 \frac{x^{2n} (1-x^2)^{2n}}{(1+ix)^{3n+1}}\, dx. \tag{3.3} \]



This gives rational approximants to $\pi$ of the form

\[ \pi = \frac{b_n}{a_n c_n} + \frac{K_n}{a_n}, \]

where $a_n, b_n, c_n$ are explicitly known integers and $K_n$ is the integral on the right hand side of
(3.3). The rational approximants show that $\pi$ is irrational (which was shown already in 1773 by
Lambert), and they even show that you can't approximate $\pi$ by rationals at order greater than
23.271 (Beukers 0), i.e., 



\[ \left| \pi - \frac{p}{q} \right| < \frac{1}{q^r} \]

with $r > 23.271$ only has a finite number of solutions $(p,q)$, where $p$ and $q$ are relatively prime
integers. This upper bound for the order of approximation can be reduced to 8.02 (Hata [20]) by
considering Markov functions $f_1$ and $f_3$, with



dx 



z — x 



This $f_3$ is now over a complex interval, and then Theorem 2.1 concerning the location of the zeros
no longer holds, and the asymptotic behavior must be handled by another method.
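The Rodrigues formula for the Legendre-Angelesco polynomials quoted earlier in this section is easy to check by machine. The sketch below, with illustrative function names, builds $P_{n,n}$ from $x^n(1-x^2)^n = (x - x^3)^n$ by exact integer arithmetic, verifies the orthogonality to $x^k$ ($k < n$) on both $[-1,0]$ and $[0,1]$, and counts sign changes on a rational grid as a rough check that $n$ zeros lie in each interval. The grid size is an arbitrary assumption that suffices for small $n$.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] += u * v
    return out

def poly_diff(a):
    return [k * c for k, c in enumerate(a)][1:]

def legendre_angelesco(n):
    """P_{n,n}(x) = d^n/dx^n [ x^n (1 - x^2)^n ], with integer coefficients."""
    g = [1]
    for _ in range(n):
        g = poly_mul(g, [0, 1, 0, -1])            # multiply by x - x^3 = x (1 - x^2)
    for _ in range(n):
        g = poly_diff(g)
    return g

def integral(a, lo, hi):
    """Exact integral of the polynomial a over [lo, hi] (integer endpoints)."""
    return sum(Fraction(c, k + 1) * (hi ** (k + 1) - lo ** (k + 1))
               for k, c in enumerate(a))

def inner(p, k, lo, hi):
    """Exact integral of x^k p(x) over [lo, hi]."""
    return integral([0] * k + p, lo, hi)

def sign_changes(p, lo, hi, cells=400):
    """Number of sign changes of p on a rational grid inside (lo, hi)."""
    signs = []
    for i in range(1, cells):
        x = lo + Fraction(hi - lo, cells) * i
        v = sum(c * x ** k for k, c in enumerate(p))
        if v != 0:
            signs.append(v > 0)
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

for n in (1, 2, 3):
    p = legendre_angelesco(n)
    checks = [inner(p, k, lo, hi) for (lo, hi) in ((-1, 0), (0, 1)) for k in range(n)]
    print(n, p, checks, sign_changes(p, -1, 0), sign_changes(p, 0, 1))
```

Every entry of `checks` should be exactly zero, which is the integration-by-parts argument in computational form: the function $x^n(1-x^2)^n$ has zeros of order $n$ at $-1$, $0$ and $1$, so all boundary terms vanish.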

One can also use Hermite-Pade approximants to prove transcendence. Then one uses the following
lemma, which extends Lemma 3.1 from irrational numbers to non-algebraic numbers.

Lemma 3.2. Let $x \in \mathbb{R}$. Suppose that for every integer $m \in \mathbb{N}$ and for all integers $a_0, a_1, \ldots, a_m \in \mathbb{Z}$
we can find integers $p_{0,n}, p_{1,n}, \ldots, p_{m,n}$ such that

1. $\sum_{k=0}^m a_k p_{k,n} \neq 0$ for all $n \in \mathbb{N}$,

2. $\lim_{n\to\infty} (p_{0,n} x^k - p_{k,n}) = 0$ for $k = 1, 2, \ldots, m$.

Then $x$ is transcendental.

Proof. Suppose that $x$ is algebraic. Then there exists an integer $m$ and integers $a_0, \ldots, a_m$ such
that $\sum_{k=0}^m a_k x^k = 0$. But then

\[ \sum_{k=0}^m a_k (p_{0,n} x^k - p_{k,n}) = - \sum_{k=0}^m a_k p_{k,n}. \]

The right hand side is an integer different from zero, hence

\[ \left| \sum_{k=0}^m a_k (p_{0,n} x^k - p_{k,n}) \right| \ge 1 \]

for all $n \in \mathbb{N}$. But this contradicts condition 2 of the lemma. Hence we must conclude that $x$ is
not algebraic. □

If we use type II Hermite-Pade approximation to $(e^{\lambda_1 x}, e^{\lambda_2 x}, \ldots, e^{\lambda_r x})$ near $x = 0$, then this
will give the transcendence of $e$. For Hermite-Pade approximation near $x = 0$ we can use two
multi-indices $\vec n = (n_1, n_2, \ldots, n_r)$ and $\vec m = (m_1, m_2, \ldots, m_r)$. These Hermite-Pade approximants
are known explicitly when $m_j + n_j = N + |\vec n|$ for $1 \le j \le r$, where $N$ is an integer. If we define the
polynomial

\[ T(x) := x^N (x - \lambda_1)^{n_1} (x - \lambda_2)^{n_2} \cdots (x - \lambda_r)^{n_r}, \]

then $T$ has degree $N + |\vec n|$. The expression

\[ P_{\vec n}(z) = z^{|\vec n| + N + 1} \int_0^\infty T(x) e^{-zx}\, dx \]

gives a polynomial of degree $|\vec n|$, and

\[ Q_{\vec n;j}(z) = z^{|\vec n| + N + 1} \int_0^\infty T(x + \lambda_j) e^{-zx}\, dx \]

gives a polynomial of degree $|\vec n| + N - n_j = m_j$. One easily verifies that

\[ P_{\vec n}(z) e^{\lambda_j z} - Q_{\vec n;j}(z) = e^{\lambda_j z} z^{|\vec n| + N + 1} \int_0^{\lambda_j} T(x) e^{-zx}\, dx = O(z^{|\vec n| + N + 1}), \]

as $z \to 0$, which are the interpolation conditions for type II Hermite-Pade approximation near the
origin for the two multi-indices $(\vec n, \vec m)$.

For proving the transcendence of $e$, we take $\lambda_j = j$, $z = 1$ and for a prime $p > r$, which is not a
divisor of $a_0$, we take $N = p - 1$ and $n_j = p$ ($j = 1, \ldots, r$). Then some elementary calculus shows
that $p_0 = P_{\vec n}(1)/(p-1)!$ is an integer which is not divisible by $p$ and each $p_j = Q_{\vec n;j}(1)/(p-1)!$ is
an integer divisible by $p$. Therefore $\sum_{j=0}^r a_j p_j$ is not divisible by $p$ and hence condition 1 of Lemma
3.2 is satisfied. Furthermore

\[ p_0 e^j - p_j = \frac{e^j}{(p-1)!} \int_0^j T(x) e^{-x}\, dx, \]

and the simple estimate $|T(x)| \le r^{(r+1)p-1}$ on $[0,r]$ shows that this converges to 0 for every $j$ when
the prime $p$ tends to infinity (luckily Euclid showed that there are infinitely many primes). So
condition 2 of Lemma 3.2 is also satisfied and we conclude that $e$ is transcendental (Hermite, 1874).
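The divisibility claims behind condition 1 can be checked directly for small cases. The sketch below uses $\int_0^\infty x^k e^{-x}\, dx = k!$ to evaluate $P_{\vec n}(1)$ and $Q_{\vec n;j}(1)$ as integer sums, with $\lambda_j = j$, $N = p - 1$ and $n_j = p$ as in the text. The function names are illustrative, and the floating-point evaluation of $p_0 e - p_1$ is only meaningful for very small parameters.

```python
from math import e, factorial

def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            out[i + j] += u * v
    return out

def shift(a, h):
    """Coefficients of a(x + h), built by Horner's scheme in the variable x + h."""
    out = [0]
    for c in reversed(a):
        out = poly_mul(out, [h, 1])
        out[0] += c
    return out

def laplace_at_one(a):
    """Integral of a(x) e^{-x} over [0, infinity), i.e. sum of a_k * k!."""
    return sum(c * factorial(k) for k, c in enumerate(a))

def hermite_integers(r, p):
    """The integers p_0 and [p_1, ..., p_r] for lambda_j = j, z = 1, N = p - 1, n_j = p."""
    t = [0] * (p - 1) + [1]                       # x^(p-1)
    for j in range(1, r + 1):
        for _ in range(p):
            t = poly_mul(t, [-j, 1])              # multiply by (x - j)
    scale = factorial(p - 1)
    p0 = laplace_at_one(t) // scale               # exact: T has lowest degree p - 1
    ps = [laplace_at_one(shift(t, j)) // scale for j in range(1, r + 1)]
    return p0, ps

p0, ps = hermite_integers(1, 3)
print(p0, ps, p0 * e - ps[0])                     # remainder p_0 e - p_1 is already small

p0, ps = hermite_integers(2, 5)
print(p0 % 5, [pj % 5 for pj in ps])              # p_0 not divisible by 5, p_1 and p_2 are
```

For $r = 1$ and $p = 3$ this gives $T(x) = x^2 (x-1)^3$, $p_0 = 32$ and $p_1 = 87$, so $p_0$ is not divisible by 3 while $p_1$ is.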

3.3 Other applications 

Recently a number of applications came up in other areas of mathematics and theoretical physics. 
There are interesting connections with random matrix theory, where multiple orthogonal polyno- 
mials (in particular multiple Hermite polynomials) appear when one investigates random matrices 
with an external source |SJ|5]. Multiple Laguerre polynomials appear for the Wishart ensemble of 
random matrices 0. Multiple Jacobi polynomials (the Jacobi-Piñeiro polynomials) were used to
obtain a counterexample to the Bethe Ansatz Conjecture for the Gaudin model j2S]- More details 
on multiple orthogonal polynomials (recursion relation, specific examples, etc.) can be found in 
[21, Chapter 23].